[ 
https://issues.apache.org/jira/browse/AVRO-680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Goyal updated AVRO-680:
------------------------------
    Attachment: PERF_8000_cycles.zip

{quote}
It would be great to have something that took a pair of runs and identified 
those tests that varied by more than a few percent. 
{quote}
I have created such a utility in Tcl; it is attached in the latest zip, 
PERF_8000_cycles.zip.

\\
\\
{quote}
Then we could re-run just those to check whether it was spurious. We also might 
try increasing CYCLES to see if that decreases the inter-run variations.
{quote}
Here is a sample run using the above utility (the corresponding data is also 
in the zip attachment).
The cycle count was increased tenfold, to 8000.
{code}
% tclsh perf_processor.tcl all/orig.txt all/nsmk.txt 
Finding tests where time is 1.05 times more than the original
                               orig.txt   nsmk.txt
GenericWrite:                  34552      37893
StringWrite:                   60999      65273
GenericOneTimeDecoderUse_Read: 38315      40928

// Run only the above tests for patched and unpatched code
% mvn exec:java -Dexec.mainClass=org.apache.avro.io.Perf \
    -Dexec.classpathScope="test" -Dexec.args="-G -s -Gotd"

% tclsh perf_processor.tcl reduced/orig.txt reduced/nsmk.txt
Finding tests where time is 1.05 times more than the original

% 
{code}
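
For readers without Tcl, the comparison that the utility performs can be sketched in Python. This is only an illustration, not the attached perf_processor.tcl: the function names and the simplified "TestName: time" input format are assumptions, and the real Perf output has more columns. The IntRead entry in the usage lines is hypothetical and is there only to show a test that stays under the threshold.

```python
import re

THRESHOLD = 1.05  # flag tests whose time is more than 1.05x the original


def parse_times(text):
    """Parse lines of the assumed form 'TestName: 12345' into {name: time}."""
    times = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+):\s+(\d+(?:\.\d+)?)", line)
        if match:
            times[match.group(1)] = float(match.group(2))
    return times


def slower_tests(orig, patched, threshold=THRESHOLD):
    """Return (name, orig_time, patched_time) for tests found in both runs
    whose patched time exceeds threshold * orig_time."""
    return [
        (name, orig[name], patched[name])
        for name in orig
        if name in patched and patched[name] > threshold * orig[name]
    ]


# Usage with the figures from the run above, plus one hypothetical entry.
orig = parse_times("GenericWrite: 34552\nStringWrite: 60999\nIntRead: 1000\n")
nsmk = parse_times("GenericWrite: 37893\nStringWrite: 65273\nIntRead: 1020\n")
for name, before, after in slower_tests(orig, nsmk):
    print(f"{name}: {before:.0f} {after:.0f}")
```

Running the same function in the other direction (swapping the two arguments) shows which tests were slower on the unpatched code.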

\\
\\
In summary, the test times with and without the patch differ by no more than 
5%.
Also, the sub-5% slowdowns are *not* always on the patched code; they are 
spread more or less equally across the patched and the unpatched code.

> Allow for non-string keys
> -------------------------
>
>                 Key: AVRO-680
>                 URL: https://issues.apache.org/jira/browse/AVRO-680
>             Project: Avro
>          Issue Type: Improvement
>    Affects Versions: 1.7.6, 1.7.7
>            Reporter: Jeremy Hanna
>         Attachments: AVRO-680.patch, AVRO-680.patch, PERF_8000_cycles.zip, 
> isMap_Call_Hierarchy.png, non_string_map_keys.zip, non_string_map_keys2.zip, 
> non_string_map_keys3.zip, non_string_map_keys4.patch, 
> non_string_map_keys5.patch, non_string_map_keys6.patch, 
> non_string_map_keys7.patch, non_string_map_perf.txt, 
> non_string_map_perf2.txt, original_perf.txt
>
>
> Based on an email thread back in April, Doug Cutting proposed a possible 
> solution for having non-string keys:
> Stu Hood wrote:
> > I can understand the reasoning behind AVRO-9, but now I need to look for an 
> > alternative to a 'map' that will allow me to store an association of bytes 
> > keys to values.
> A map of Foo has the same binary format as an array of records, each
> with a string field and a Foo field.  So an application can use an array
> schema similar to this to represent map-like structures with, e.g.,
> non-string keys.
> Perhaps we could establish standard properties that indicate that a
> given array of records should be represented in a map-like way if
> possible?  E.g.,:
> {"type": "array", "isMap": true, "items": {"type":"record", ...}}
> Doug
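
To make the proposal quoted above concrete, here is a rough Python sketch of an array-of-records schema carrying the suggested "isMap" property, with helpers that move between the record list and a native map. The record name and the field names "key" and "value" are assumptions made for illustration; the quoted message elides the record definition, and nothing here uses the Avro library itself.

```python
import json

# Hypothetical schema: only "type": "array" and "isMap": true come from the
# proposal; the record name and field names are assumed for illustration.
SCHEMA = {
    "type": "array",
    "isMap": True,
    "items": {
        "type": "record",
        "name": "BytesKeyEntry",
        "fields": [
            {"name": "key", "type": "bytes"},
            {"name": "value", "type": "long"},
        ],
    },
}


def records_to_map(records):
    """View an array of key/value records as a map with non-string keys."""
    return {rec["key"]: rec["value"] for rec in records}


def map_to_records(mapping):
    """Turn a map back into the array-of-records form used on the wire."""
    return [{"key": k, "value": v} for k, v in mapping.items()]


entries = [{"key": b"\x00\x01", "value": 1}, {"key": b"\x00\x02", "value": 2}]
as_map = records_to_map(entries)
assert map_to_records(as_map) == entries  # round trip keeps entries intact
print(json.dumps(SCHEMA))
```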



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
