[jira] Commented: (PIG-924) Make Pig work with multiple versions of Hadoop
[ https://issues.apache.org/jira/browse/PIG-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744953#action_12744953 ]

Hadoop QA commented on PIG-924:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12416945/pig_924.3.patch
against trunk revision 804406.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 8 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/173/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/173/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/173/console

This message is automatically generated.

Make Pig work with multiple versions of Hadoop
--
Key: PIG-924
URL: https://issues.apache.org/jira/browse/PIG-924
Project: Pig
Issue Type: Bug
Reporter: Dmitriy V. Ryaboy
Attachments: pig_924.2.patch, pig_924.3.patch, pig_924.patch

The current Pig build scripts package hadoop and other dependencies into the pig.jar file. This means that if users upgrade Hadoop, they also need to upgrade Pig.
Pig has relatively few dependencies on Hadoop interfaces that changed between 18, 19, and 20. It is possible to write a dynamic shim that allows Pig to use the correct calls for any of the above versions of Hadoop. Unfortunately, the build process prevents us from doing this at runtime, and forces an unnecessary Pig rebuild even if dynamic shims are created.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-925) Fix join in local mode
[ https://issues.apache.org/jira/browse/PIG-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745018#action_12745018 ]

Hudson commented on PIG-925:

Integrated in Pig-trunk #527 (See [http://hudson.zones.apache.org/hudson/job/Pig-trunk/527/])
: Fix join in local mode

Fix join in local mode
--
Key: PIG-925
URL: https://issues.apache.org/jira/browse/PIG-925
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.4.0
Attachments: PIG-925-1.patch, PIG-925-2.patch, PIG-925-3.patch

Join is broken after LOJoin patch (Optimizer_Phase5.patch of [PIG-697|https://issues.apache.org/jira/browse/PIG-697]). Even the simplest join script is not working under local mode, e.g.:

    a = load '1.txt';
    b = load '2.txt';
    c = join a by $0, b by $0;
    dump c;

Caused by: java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:206)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:191)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.local.executionengine.physicalLayer.counters.POCounter.getNext(POCounter.java:71)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117)
    at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:146)
    at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:109)
    at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165)
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745049#action_12745049 ]

He Yongqiang commented on PIG-833:
--

Can you add more description or explanation, in this jira or on a wiki page, about usage, such as the schema format, storage format, projection, and partition?

Storage access layer

Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz

A layer is needed to provide a high level data access abstraction and a tabular view of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval code. This layer should also include a columnar storage format in order to provide fast data projection, CPU/space-efficient data serialization, and a schema language to manage physical storage metadata. Eventually it could also support predicate pushdown for further performance improvement. Initially, this layer could be a contrib project in Pig and become a hadoop subproject later on.
[jira] Commented: (PIG-924) Make Pig work with multiple versions of Hadoop
[ https://issues.apache.org/jira/browse/PIG-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745109#action_12745109 ]

Dmitriy V. Ryaboy commented on PIG-924:
---

Regarding deprecation -- I tried setting it back to off, and adding @SuppressWarnings("deprecation") to the shims for 20, but ant complained about deprecation nonetheless. Not sure what its deal is.

Adding something like this to the main build.xml works. Does this seem like a reasonable solution?

{code}
<!-- set deprecation off if hadoop version greater or equals 20 -->
<target name="set_deprecation">
  <condition property="hadoop_is20">
    <equals arg1="${hadoop.version}" arg2="20"/>
  </condition>
  <antcall target="if_hadoop_is20"/>
  <antcall target="if_hadoop_not20"/>
</target>
<target name="if_hadoop_is20" if="hadoop_is20">
  <property name="javac.deprecation" value="off" />
</target>
<target name="if_hadoop_not20" unless="hadoop_is20">
  <property name="javac.deprecation" value="on" />
</target>
<target name="init" depends="set_deprecation">
[]
{code}
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745125#action_12745125 ]

Jing Huang commented on PIG-833:

Zebra supports int, long, float, double, bool, collection (equivalent to Pig Bag), map, record (equivalent to Pig Tuple), string, bytes (equivalent to Pig Bytearray).
[jira] Updated: (PIG-928) UDFs in scripting languages
[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-928:
---

Attachment: package.zip

Attaching some preliminary work by Kishore Gopalakrishna on this. This code is a good start, but not ready for inclusion. It needs to be cleaned up, put in our class structure, etc.

Comments from Kishore:

It contains all the libraries required and also the GenericEval UDF and GenericFilter UDF. I didn't get a chance to get the Algebraic function working.

To test it, just unzip the package and run:

    rm -rf wordcount/output;
    pig -x local wordcount.pig           --- to test eval
    pig -x local wordcount_filter.pig    --- to test filter [sorry it should be named filter.pig]
    cat wordcount/output

UDFs in scripting languages
---
Key: PIG-928
URL: https://issues.apache.org/jira/browse/PIG-928
Project: Pig
Issue Type: New Feature
Reporter: Alan Gates
Attachments: package.zip

It should be possible to write UDFs in scripting languages such as python, ruby, etc. This frees users from needing to compile Java, generate a jar, etc. It also opens Pig to programmers who prefer scripting languages over Java.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
-

Attachment: (was: mj_phase2_1.patch)

Merge-Join phase 2
--
Key: PIG-926
URL: https://issues.apache.org/jira/browse/PIG-926
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on: http://wiki.apache.org/pig/PigMergeJoin Those will be addressed here.
[jira] Commented: (PIG-924) Make Pig work with multiple versions of Hadoop
[ https://issues.apache.org/jira/browse/PIG-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745160#action_12745160 ]

Daniel Dai commented on PIG-924:

From your latest patch, the shims work this way:
1. The version of shims Pig compiles is controlled by the hadoop.version property in build.xml.
2. The version of shims Pig uses is determined dynamically by hacking the string returned by VersionInfo.getVersion.

As your code comment says, the version string hack is not safe. My thinking is that Pig should use only the bundled hadoop unless overridden:
1. Pig compiles all versions of shims. There is no conflict between different versions of shims, so why not compile them all? Then users do not need to recompile the code if they want to use a different external hadoop.
2. Pig bundles a default hadoop, which is specified by hadoop.version in build.xml. Pig uses this version of shims by default.
3. If a user wants to use an external hadoop, he/she needs to override the default hadoop version explicitly, e.g., -Dhadoop_version on the command line.
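
The selection rule proposed in the comment above (all shims compiled in, the bundled version used by default, an explicit override for an external hadoop) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class ShimSelector, the shim class names, and the bundled default of 18 are all hypothetical and are not taken from the PIG-924 patch; only the -Dhadoop_version property name comes from the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names come from the PIG-924 patch. */
final class ShimSelector {
    // Hypothetical registry: every shim is compiled in, keyed by Hadoop version.
    private static final Map<String, String> SHIMS = new LinkedHashMap<>();
    static {
        SHIMS.put("18", "shims.Hadoop18Shim");
        SHIMS.put("19", "shims.Hadoop19Shim");
        SHIMS.put("20", "shims.Hadoop20Shim");
    }

    /**
     * Returns the shim class name to use. An explicit override wins;
     * otherwise the bundled default version is used. No version string
     * reported by Hadoop is parsed.
     */
    static String select(String override, String bundledDefault) {
        String version = (override == null || override.isEmpty()) ? bundledDefault : override;
        String shim = SHIMS.get(version);
        if (shim == null) {
            throw new IllegalArgumentException("No shim compiled for Hadoop version " + version);
        }
        return shim;
    }

    public static void main(String[] args) {
        // Reads the override from -Dhadoop_version=...; "18" is an assumed bundled default.
        String override = System.getProperty("hadoop_version");
        System.out.println(select(override, "18"));
    }
}
```

Failing fast on an unknown version, rather than guessing, is one way to avoid the unsafe version-string hack mentioned in the comment; whether that fits Pig's startup path is a separate question.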
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
-

Status: Open (was: Patch Available)
[jira] Commented: (PIG-924) Make Pig work with multiple versions of Hadoop
[ https://issues.apache.org/jira/browse/PIG-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745166#action_12745166 ]

Todd Lipcon commented on PIG-924:
-

bq. If existing deployments need a single pig.jar without a hadoop dependency, it might be possible to create a new target (pig-all) that would create a statically bundled jar; but I think the default behavior should be to not bundle, build all the shims, and use whatever hadoop is on the path.

+1 for making the default to *not* bundle hadoop inside pig.jar, and adding another non-default target for those people who might want it.

bq. The current patch is written as is so that it can be applied to trunk, enabling people to compile statically, and only require a change to the ant build files to switch to a dynamic compile later on (after 0.4, probably)

From the packager's perspective, I'd love if this change could get in for 0.4. If it doesn't, we'll end up applying the patch ourselves for packaging purposes - we need to have the hadoop dependency be on the user's installed hadoop, not on whatever happened to get bundled into pig.jar.
[jira] Commented: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745181#action_12745181 ]

Pradeep Kamath commented on PIG-926:

In MRCompiler, you should change:
{code}
indexerArgs[0] = rightLoader.getLFile().getFuncName();
to
indexerArgs[0] = rightLoader.getLFile().getFuncSpec().toString();
{code}
to handle the case where the loader may have constructor args (like PigStorage(',') - PigStorage with comma as delim).

In the error message when the loader does not implement SamplableLoader, you can change:
{noformat}
"This loader doesn't implement it."
to
"The loader specified in " + indexerArgs[0] + " doesn't implement it"
{noformat}

Otherwise looks good.
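
To illustrate why the review comment above prefers the full function spec over the bare function name, here is a minimal sketch. LoaderSpec is a hypothetical stand-in written for this example; it is not Pig's FuncSpec class, and the exact string format Pig produces is an assumption.

```java
/** Hypothetical stand-in for a loader function spec; not Pig's actual FuncSpec. */
final class LoaderSpec {
    private final String className;
    private final String[] ctorArgs;

    LoaderSpec(String className, String... ctorArgs) {
        this.className = className;
        this.ctorArgs = ctorArgs.clone();
    }

    /** Class name only: any constructor arguments are lost. */
    String funcName() {
        return className;
    }

    /** Full spec including quoted constructor arguments, e.g. PigStorage(','). */
    @Override
    public String toString() {
        if (ctorArgs.length == 0) {
            return className;
        }
        StringBuilder sb = new StringBuilder(className).append('(');
        for (int i = 0; i < ctorArgs.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('\'').append(ctorArgs[i]).append('\'');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        LoaderSpec spec = new LoaderSpec("PigStorage", ",");
        System.out.println(spec.funcName()); // name without the delimiter argument
        System.out.println(spec);            // name plus the delimiter argument
    }
}
```

If only the name were passed along, a component that re-instantiates the loader later would fall back to its default constructor and silently ignore the comma delimiter.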
Build failed in Hudson: Pig-Patch-minerva.apache.org #174
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/174/changes

Changes:
[daijy] PIG-925: Fix join in local mode

--
started
Building remotely on minerva.apache.org (Ubuntu)
Updating http://svn.apache.org/repos/asf/hadoop/pig/trunk
U test/org/apache/pig/test/TestLocal2.java
U CHANGES.txt
U src/org/apache/pig/backend/local/executionengine/physicalLayer/LocalLogToPhyTranslationVisitor.java
Fetching 'http://svn.apache.org/repos/asf/hadoop/nightly/test-patch' at -1 into 'http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/ws/trunk/test/bin'
At revision 805964
At revision 805964
no change for http://svn.apache.org/repos/asf/hadoop/nightly/test-patch since the previous build
[Pig-Patch-minerva.apache.org] $ /bin/bash /tmp/hudson4495449519603127293.sh
/home/hudson/tools/java/latest1.6/bin/java
Buildfile: build.xml

check-for-findbugs:

findbugs.check:

java5.check:

forrest.check:

hudson-test-patch:
[exec] ==
[exec] ==
[exec] Testing patch for PIG-926.
[exec] ==
[exec] ==
[exec] Reverted 'test/org/apache/pig/test/MiniCluster.java'
[exec] Reverted 'src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java'
[exec] Reverted 'src/org/apache/pig/backend/hadoop/datastorage/HConfiguration.java'
[exec] Reverted 'src/org/apache/pig/backend/hadoop/datastorage/HDataStorage.java'
[exec] Reverted 'src/org/apache/pig/tools/pigstats/PigStats.java'
[exec] Reverted 'src/org/apache/pig/impl/io/NullableBytesWritable.java'
[exec] Reverted 'build.xml'
[exec] Reverted 'contrib/piggybank/java/build.xml'
[exec] Fetching external item into 'test/bin'
[exec] A test/bin/test-patch.sh
[exec] Updated external to revision 805964.
[exec] Updated to revision 805964.
[exec] PIG-926 is not Patch Available. Exiting.
[exec] ==
[exec] ==
[exec] Finished build.
[exec] ==
[exec] ==

BUILD SUCCESSFUL
Total time: 9 seconds
ERROR: No artifacts found that match the file pattern trunk/build/test/findbugs/newPatchFindbugsWarnings.html,trunk/patchprocess/*Warnings.txt. Configuration error?
ERROR: 'trunk/build/test/findbugs/newPatchFindbugsWarnings.html' doesn't match anything: 'trunk' exists but not 'trunk/build/test/findbugs/newPatchFindbugsWarnings.html'
Recording test results
Description found: PIG-926
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
-

Attachment: (was: mj_phase2_1.patch)
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
-

Attachment: mj_phase2_1.patch

Updated patch addressing Pradeep's comments.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
-

Status: Patch Available (was: Open)
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745219#action_12745219 ]

Raghu Angadi commented on PIG-833:
--

Thanks Jing.

There are some PIG examples listed at the bottom of the Zebra wiki: http://wiki.apache.org/pig/zebra (the wiki is still under construction).

Just listing the java strings in Jing's comment without Jira formatting:

{noformat}
final static String STR_SCHEMA = "s1:bool, s2:int, s3:long, s4:float, s5:string, s6:bytes, "
    + "r1:record(f1:int, f2:long), r2:record(r3:record(f3:float, f4)), "
    + "m1:map(string),m2:map(map(int)), c:collection(f13:double, f14:float, f15:bytes)";
final static String STR_STORAGE = "[s1, s2]; [m1#{a}]; [r1.f1]; [s3, s4, r2.r3.f3]; [s5, s6, m2#{x|y}]; "
    + "[r1.f2, m1#{b}]; [r2.r3.f4, m2#{z}]";
{noformat}
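
As a reading aid for the STR_STORAGE string quoted in the comment above, the sketch below splits it into its bracketed groups. It assumes, from the shape of that one example only, that groups are separated by semicolons and wrapped in square brackets. ColumnGroups is a hypothetical helper written for this illustration; it is not Zebra's parser and does not interpret the map-key syntax such as m1#{a}.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Zebra. */
final class ColumnGroups {
    /** Splits a storage string like "[s1, s2]; [r1.f1]" into the text inside each bracket pair. */
    static List<String> split(String storage) {
        List<String> groups = new ArrayList<>();
        for (String part : storage.split(";")) {
            String group = part.trim();
            if (group.isEmpty()) {
                continue; // tolerate a trailing semicolon
            }
            if (!group.startsWith("[") || !group.endsWith("]")) {
                throw new IllegalArgumentException("Expected a bracketed group but got: " + group);
            }
            groups.add(group.substring(1, group.length() - 1).trim());
        }
        return groups;
    }

    public static void main(String[] args) {
        String storage = "[s1, s2]; [m1#{a}]; [r1.f1]; [s3, s4, r2.r3.f3]; [s5, s6, m2#{x|y}]; "
            + "[r1.f2, m1#{b}]; [r2.r3.f4, m2#{z}]";
        // Print each group on its own line.
        for (String group : split(storage)) {
            System.out.println(group);
        }
    }
}
```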
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745250#action_12745250 ]

He Yongqiang commented on PIG-833:
--

Thanks Jing. I now have a better understanding of schema and column groups. What are the projection and partition used for?
[jira] Commented: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745270#action_12745270 ]

Hadoop QA commented on PIG-926:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417053/mj_phase2_1.patch
against trunk revision 805684.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/175/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/175/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/175/console

This message is automatically generated.