> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > src/org/apache/pig/builtin/StreamUDFToPig.java, line 47
> > <https://reviews.apache.org/r/13781/diff/2/?file=348274#file348274line47>
> >
> >     This might be affected by PIG-3255

Looks like it would be easy to make this patch work with PIG-3255.  If we could 
get that merged shortly I'd update this patch with the necessary changes.


> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > src/org/apache/pig/builtin/StreamUDFToPig.java, line 109
> > <https://reviews.apache.org/r/13781/diff/2/?file=348274#file348274line109>
> >
> >     How do we handle delimit escaping? Seems we don't handle it right now?

We don't handle it now.  We use a 3 character delimiter which avoids collisions 
in practice.  I would actually like to switch the serialization to a 'real' 
serialization library but we did a little bit of prototyping a few months ago 
and there wasn't anything out there that worked really well and had significant 
performance improvements.  Avro was the most promising but on the Python side 
it was missing a number of features we needed.


> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > src/org/apache/pig/builtin/StreamingUDF.java, line 169
> > <https://reviews.apache.org/r/13781/diff/2/?file=348276#file348276line169>
> >
> >     Are those files only for illustrate? Why not just use a random tmp 
> > file? Why do we distinguish exectype in illustrate? It only run locally.

I renamed the class to remove Illustrate from the name.  This is actually used 
both for illustrate (where we capture the output only from the last run) and in 
real runs (where we write the output to the user logs).  By putting it in the 
user logs you can get the logs from the task tracker web interface.  We could 
use temp files but I like the more human readable names.


> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > src/org/apache/pig/builtin/StreamingUDF.java, line 198
> > <https://reviews.apache.org/r/13781/diff/2/?file=348276#file348276line198>
> >
> >     Any reason not to create a StreamingUDFInputHandler?

I didn't really think it was necessary but I guess its more consistent.


> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > src/org/apache/pig/builtin/StreamingUDF.java, line 366
> > <https://reviews.apache.org/r/13781/diff/2/?file=348276#file348276line366>
> >
> >     Seems flush() is more proper to put inside inputHandler

I'm not sure if it would be unnecessary for some clients of InputHandler.  For 
the StreamingUDF we need to make sure we always flush or else we can get 
blocked on the reading process not getting the full input (and then never 
writing the output) but I'm not sure if its always necessary.


> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > src/org/apache/pig/impl/streaming/InputHandler.java, line 66
> > <https://reviews.apache.org/r/13781/diff/2/?file=348282#file348282line66>
> >
> >     Why we need this change?

Probably just left over from debugging where I wanted to know the value of the 
byte[].


> On Sept. 7, 2013, 12:04 a.m., Daniel Dai wrote:
> > test/unit-tests, line 73
> > <https://reviews.apache.org/r/13781/diff/2/?file=348302#file348302line73>
> >
> >     This is not true unit test, should remove from the list

Can it go in commit-tests?


- Jeremy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13781/#review25964
-----------------------------------------------------------


On Sept. 5, 2013, 8:23 p.m., Jeremy Karn wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13781/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2013, 8:23 p.m.)
> 
> 
> Review request for pig.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Changes for PIG-2417 (https://issues.apache.org/jira/browse/PIG-2417)
> 
> 
> Diffs
> -----
> 
>   build.xml 7e22192 
>   src/org/apache/pig/PigToStream.java 7cc2950 
>   src/org/apache/pig/StreamToPig.java ff24b27 
>   src/org/apache/pig/builtin/PigStreaming.java 5467693 
>   src/org/apache/pig/builtin/PigToStreamUDF.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamUDFToPig.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingDelimiters.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingUDF.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingUDFException.java PRE-CREATION 
>   src/org/apache/pig/builtin/StreamingUDFOutputSchemaException.java 
> PRE-CREATION 
>   src/org/apache/pig/impl/streaming/DefaultInputHandler.java 301bea3 
>   src/org/apache/pig/impl/streaming/DefaultOutputHandler.java 1b46e7d 
>   src/org/apache/pig/impl/streaming/ExecutableManager.java cf79c83 
>   src/org/apache/pig/impl/streaming/InputHandler.java 690d94e 
>   src/org/apache/pig/impl/streaming/OutputHandler.java 6e9262a 
>   src/org/apache/pig/impl/streaming/StreamingUDFOutputHandler.java 
> PRE-CREATION 
>   src/org/apache/pig/impl/streaming/StreamingUtil.java PRE-CREATION 
>   src/org/apache/pig/impl/util/JarManager.java 5c4acb0 
>   src/org/apache/pig/impl/util/StorageUtil.java dcb62ec 
>   src/org/apache/pig/scripting/ScriptEngine.java 29a9e1f 
>   src/org/apache/pig/scripting/ScriptingIllustrateOutputCapturer.java 
> PRE-CREATION 
>   src/org/apache/pig/scripting/streaming/python/PythonScriptEngine.java 
> PRE-CREATION 
>   src/python/streaming/controller.py PRE-CREATION 
>   src/python/streaming/pig_util.py PRE-CREATION 
>   test/org/apache/pig/builtin/TestPigToStreamUDF.java PRE-CREATION 
>   test/org/apache/pig/builtin/TestStreamUDFToPig.java PRE-CREATION 
>   test/org/apache/pig/builtin/TestStreamingUDF.java PRE-CREATION 
>   test/org/apache/pig/impl/streaming/TestExecutableManager.java 6246019 
>   test/org/apache/pig/impl/streaming/TestStreamingUDFOutputHandler.java 
> PRE-CREATION 
>   test/org/apache/pig/impl/streaming/TestStreamingUtil.java PRE-CREATION 
>   test/org/apache/pig/test/TestPigStreaming.java PRE-CREATION 
>   test/org/apache/pig/test/TestStreaming.java 1eac5d2 
>   test/python/streaming/test_controller.py PRE-CREATION 
>   test/unit-tests d52ad9d 
> 
> Diff: https://reviews.apache.org/r/13781/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jeremy Karn
> 
>

Reply via email to