> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamUDFToPig.java, line 74
> > <https://reviews.apache.org/r/13781/diff/1/?file=344447#file344447line74>
> >
> >     decoding string is actually quite costly as java strings are utf-16 and 
> > creating a string always copies the content of the byte array.
> >     Can't you just use the binary format of the type?

I'm not sure how to use the binary format without significantly rewriting the 
serialization logic on the python side to serialize the data in a byte format 
(using something like struct.pack).   Is that what you were thinking of, or 
were you thinking of something else?

The last profiling we did showed that the bottleneck is typically on the python 
side deserializing the input from Pig.  I think any performance enhancements at 
this point should be done in a new jira with solid profiling over a number of 
test cases.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamUDFToPig.java, line 98
> > <https://reviews.apache.org/r/13781/diff/1/?file=344447#file344447line98>
> >
> >     would StreamTokenizer help ?

I took a look, and I don't think its worth re-writing this logic using 
StreamTokenizer.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamUDFToPig.java, line 109
> > <https://reviews.apache.org/r/13781/diff/1/?file=344447#file344447line109>
> >
> >     if you know the size in advance you can pass it here and avoid resizing

Done.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamUDFToPig.java, line 123
> > <https://reviews.apache.org/r/13781/diff/1/?file=344447#file344447line123>
> >
> >     you probably want to do getInstance() once

Done.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamingDelimiters.java, line 74
> > <https://reviews.apache.org/r/13781/diff/1/?file=344448#file344448line74>
> >
> >     why do we need a hashmap to access those statically defined values?

Done.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamingUDF.java, line 51
> > <https://reviews.apache.org/r/13781/diff/1/?file=344449#file344449line51>
> >
> >     enum?

An enum seems a little more verbose without much benefit in this case.  I don't 
feel strongly about it though.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamingUDF.java, line 382
> > <https://reviews.apache.org/r/13781/diff/1/?file=344449#file344449line382>
> >
> >     if (info) log.info

Fixed.  This log statement was removed entirely.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/builtin/StreamingUDFException.java, line 23
> > <https://reviews.apache.org/r/13781/diff/1/?file=344450#file344450line23>
> >
> >     if you don't make the parent see the cause printStackTrace won't 
> > display the cause's stackTrace

Done.


> On Sept. 4, 2013, 11:36 p.m., Julien Le Dem wrote:
> > src/org/apache/pig/impl/util/StorageUtil.java, line 176
> > <https://reviews.apache.org/r/13781/diff/1/?file=344460#file344460line176>
> >
> >     why is that?

No good reason, so I added BigDecimal and BigInteger support.


- Jeremy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13781/#review25913
-----------------------------------------------------------


On Sept. 5, 2013, 8:23 p.m., Jeremy Karn wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13781/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2013, 8:23 p.m.)
> 
> 
> Review request for pig.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Changes for PIG-2417 (https://issues.apache.org/jira/browse/PIG-2417)
> 
> 
> Diffs
> -----
> 
>   build.xml 7e22192 
>   src/org/apache/pig/PigToStream.java 7cc2950 
>   src/org/apache/pig/StreamToPig.java ff24b27 
>   src/org/apache/pig/builtin/PigStreaming.java 5467693 
>   src/org/apache/pig/builtin/PigToStreamUDF.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamUDFToPig.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingDelimiters.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingUDF.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingUDFException.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingUDFOutputSchemaException.java 
> PRE-CREATION 
>   src/org/apache/pig/impl/streaming/DefaultInputHandler.java 301bea3 
>   src/org/apache/pig/impl/streaming/DefaultOutputHandler.java 1b46e7d 
>   src/org/apache/pig/impl/streaming/ExecutableManager.java cf79c83 
>   src/org/apache/pig/impl/streaming/InputHandler.java 690d94e 
>   src/org/apache/pig/impl/streaming/OutputHandler.java 6e9262a 
>   src/org/apache/pig/impl/streaming/StreamingUDFOutputHandler.java 
> PRE-CREATION 
>   src/org/apache/pig/impl/streaming/StreamingUtil.java PRE-CREATION 
>   src/org/apache/pig/impl/util/JarManager.java 5c4acb0 
>   src/org/apache/pig/impl/util/StorageUtil.java dcb62ec 
>   src/org/apache/pig/scripting/ScriptEngine.java 29a9e1f 
>   src/org/apache/pig/scripting/ScriptingIllustrateOutputCapturer.java 
> PRE-CREATION 
>   src/org/apache/pig/scripting/streaming/python/PythonScriptEngine.java 
> PRE-CREATION 
>   src/python/streaming/controller.py PRE-CREATION 
>   src/python/streaming/pig_util.py PRE-CREATION 
>   test/org/apache/pig/builtin/TestPigToStreamUDF.java PRE-CREATION 
>   test/org/apache/pig/builtin/TestStreamUDFToPig.java PRE-CREATION 
>   test/org/apache/pig/builtin/TestStreamingUDF.java PRE-CREATION 
>   test/org/apache/pig/impl/streaming/TestExecutableManager.java 6246019 
>   test/org/apache/pig/impl/streaming/TestStreamingUDFOutputHandler.java 
> PRE-CREATION 
>   test/org/apache/pig/impl/streaming/TestStreamingUtil.java PRE-CREATION 
>   test/org/apache/pig/test/TestPigStreaming.java PRE-CREATION 
>   test/org/apache/pig/test/TestStreaming.java 1eac5d2 
>   test/python/streaming/test_controller.py PRE-CREATION 
>   test/unit-tests d52ad9d 
> 
> Diff: https://reviews.apache.org/r/13781/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jeremy Karn
> 
>

Reply via email to