[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader
[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755709#action_12755709 ]

Alan Gates commented on PIG-911:

I'm reviewing this patch.

> [Piggybank] SequenceFileLoader
> ---
>
> Key: PIG-911
> URL: https://issues.apache.org/jira/browse/PIG-911
> Project: Pig
> Issue Type: New Feature
> Reporter: Dmitriy V. Ryaboy
> Attachments: pig_911.2.patch, pig_sequencefile.patch
>
> The proposed piggybank contribution adds a SequenceFileLoader to the piggybank.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader
[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744373#action_12744373 ]

Hadoop QA commented on PIG-911:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12416830/pig_911.2.patch
against trunk revision 804406.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/168/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/168/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/168/console

This message is automatically generated.
[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader
[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744343#action_12744343 ]

Dmitriy V. Ryaboy commented on PIG-911:

Concerning making this a StoreFunc as well: the StoreFunc interface is not very friendly to this. All you get in the bind call is the output stream; for LoadFunc, you also get the name of the file (or, presumably, whatever the user passed in under the guise of a file name).

For the LoadFunc, I was able to use the passed-in filename to back into a Path and a FileSystem. I can't do the same for StoreFunc, where the filename is not available and only the output stream is. That means I can't create the appropriate SequenceFile.Writer.

Is there a way around this limitation that does not require special constructor parameters? Is it possible to change the StoreFunc API to provide this information, or to make it available through some side channel (MapRedUtils or similar)?
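A rough sketch of the asymmetry described in this comment, not code from the patch. The two interfaces are simplified stand-ins with hypothetical names (SketchLoadFunc, SketchStoreFunc) that mirror only the parameters mentioned above. java.net.URI stands in for the Hadoop Path/FileSystem lookup so the example compiles without Hadoop, and the hdfs:// location is invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

// Hypothetical stand-in: the load-side hook receives the file name and a stream.
interface SketchLoadFunc {
    void bindTo(String fileName, InputStream in, long offset, long end) throws IOException;
}

// Hypothetical stand-in: the store-side hook receives only the output stream.
interface SketchStoreFunc {
    void bindTo(OutputStream out) throws IOException;
}

final class SketchLoader implements SketchLoadFunc {
    URI location; // recovered from the file name passed to bindTo

    @Override
    public void bindTo(String fileName, InputStream in, long offset, long end) {
        // The name is enough to work back to a scheme and a path,
        // analogous to deriving a FileSystem and a Path from it.
        location = URI.create(fileName);
    }
}

final class SketchStorer implements SketchStoreFunc {
    OutputStream out;

    @Override
    public void bindTo(OutputStream out) {
        // No name arrives here, so the output location cannot be recovered
        // and a location-based writer cannot be opened from this call alone.
        this.out = out;
    }
}

final class BindAsymmetryDemo {
    public static void main(String[] args) {
        SketchLoader loader = new SketchLoader();
        loader.bindTo("hdfs://namenode/data/foo.seq",
                new ByteArrayInputStream(new byte[0]), 0L, 0L);
        System.out.println(loader.location.getScheme() + " " + loader.location.getPath());

        SketchStorer storer = new SketchStorer();
        storer.bindTo(new ByteArrayOutputStream());
    }
}
```

The storer ends up holding a stream and nothing else, which is why the comment asks for either an API change or a side channel.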
[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader
[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742565#action_12742565 ]

Dmitriy V. Ryaboy commented on PIG-911:

Alan,

Thanks for the feedback. I'll add the try/catch.

Regarding the UTF8StorageConverter: I think I added it because, without it, the code broke if you didn't declare a schema at load time, that is, with

    a = load 'foo' using SequenceFileLoader() as (a, b)

instead of

    a = load 'foo' using SequenceFileLoader() as (a:chararray, b:double)

I'll figure out exactly what is going on with that and remove the UTF8StorageConverter.

Will add Store as time allows.

> [Piggybank] SequenceFileLoader
> ---
>
> Key: PIG-911
> URL: https://issues.apache.org/jira/browse/PIG-911
> Project: Pig
> Issue Type: New Feature
> Reporter: Dmitriy V. Ryaboy
> Attachments: pig_sequencefile.patch
>
> The proposed piggybank contribution adds a SequenceFileLoader to the piggybank.
[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader
[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742239#action_12742239 ]

Alan Gates commented on PIG-911:

Dmitriy,

First, this is great. We've had requests to read Sequence files. Being able to write them as well would be great.

A few thoughts:

1) This should not extend UTF8StorageConverter. This loader will be returning actual data types, not bytes that need to be interpreted. I would think instead that it should implement the bytesToX() methods itself and just throw an exception saying it didn't expect to do any conversion.

2) The getSampledTuple looks fine if skip is handling getting the stream to the point that reading the next tuple is viable.

3) In the bindTo call, where you obtain the key and value by reflection, should there be a try/catch block in case the cast to Writable fails? In the same way, in describe schema you're asking how to suppress warnings from the cast in reader.getKeyClass(). But don't you want to check that what you got really is a Writable, since there is no guarantee?
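Points 1 and 3 of this review could look roughly like the sketch below. It is an illustration under stated assumptions, not code from the patch: SketchWritable is a stand-in for Hadoop's Writable interface so the example compiles without Hadoop, and the names SketchText, WritableCheck, newWritable and the exception messages are all hypothetical.

```java
import java.io.IOException;

// Stand-in (assumption) for org.apache.hadoop.io.Writable.
interface SketchWritable { }

// Stand-in (assumption) for a concrete key or value type.
final class SketchText implements SketchWritable { }

final class WritableCheck {

    // Point 3: verify the reflected class really is a Writable before casting,
    // and turn reflection failures into an IOException instead of letting a
    // ClassCastException escape.
    static SketchWritable newWritable(Class<?> cls) throws IOException {
        if (!SketchWritable.class.isAssignableFrom(cls)) {
            throw new IOException(cls.getName() + " is not a Writable");
        }
        try {
            return cls.asSubclass(SketchWritable.class)
                      .getDeclaredConstructor()
                      .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IOException("Could not instantiate " + cls.getName(), e);
        }
    }

    // Point 1: a bytesToX-style method that refuses to convert, because the
    // loader is expected to hand back typed values rather than raw bytes.
    static Integer bytesToInteger(byte[] b) throws IOException {
        throw new IOException("Did not expect to convert bytes to Integer");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(newWritable(SketchText.class).getClass().getSimpleName());
        try {
            newWritable(String.class);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking with isAssignableFrom before the cast also removes the need for an unchecked-cast warning suppression, since asSubclass performs a checked narrowing.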
[jira] Commented: (PIG-911) [Piggybank] SequenceFileLoader
[ https://issues.apache.org/jira/browse/PIG-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740290#action_12740290 ]

Hadoop QA commented on PIG-911:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12415673/pig_sequencefile.patch
against trunk revision 801460.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/153/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/153/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/153/console

This message is automatically generated.