Yes, that should work. I will use InputFormat.getNext from the SampleLoader
to skip the records.
Thanks,
Thejas
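A minimal sketch of the idea discussed in this thread: advance past raw records without parsing them, and parse only the records that are kept as samples. RawReader, LineReader and sample are hypothetical names made up for illustration; they are not Pig's SampleLoader or Hadoop's InputFormat API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SkipSampler {
    /** Hypothetical stand-in for a record reader; not Pig's or Hadoop's API. */
    public interface RawReader {
        boolean advance();        // move to the next raw record without parsing it
        String[] parseCurrent();  // parse the current record into fields (the costly step)
    }

    /** Toy reader over in-memory lines that counts how often it parses. */
    public static class LineReader implements RawReader {
        private final Iterator<String> lines;
        private String current;
        public int parseCount = 0;

        public LineReader(List<String> input) {
            this.lines = input.iterator();
        }

        @Override
        public boolean advance() {
            if (!lines.hasNext()) {
                return false;
            }
            current = lines.next();
            return true;
        }

        @Override
        public String[] parseCurrent() {
            parseCount++;
            return current.split("\t");
        }
    }

    /** Skips `skip` records, keeps the next one, and repeats; only kept records are parsed. */
    public static List<String[]> sample(RawReader reader, int skip) {
        List<String[]> out = new ArrayList<>();
        while (true) {
            for (int i = 0; i < skip; i++) {
                if (!reader.advance()) {
                    return out;
                }
            }
            if (!reader.advance()) {
                return out;
            }
            out.add(reader.parseCurrent());
        }
    }

    public static void main(String[] args) {
        LineReader reader = new LineReader(
                Arrays.asList("a\t1", "b\t2", "c\t3", "d\t4", "e\t5", "f\t6"));
        List<String[]> samples = sample(reader, 2);
        System.out.println(samples.size() + " sampled, " + reader.parseCount + " parsed");
    }
}
```

In this toy version the number of parse calls equals the number of samples kept, which is the saving the thread is after.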
On 11/3/09 6:39 PM, Alan Gates ga...@yahoo-inc.com wrote:
We definitely want to avoid parsing every tuple when sampling. But do
we need to implement a special function for it? Pig
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Zhang updated PIG-970:
---
Status: Open (was: Patch Available)
Support of HBase 0.20.0
---
Key:
In the new implementation of SampleLoader subclasses (used by order-by,
skew-join, ...) as part of the loader redesign, we are not only reading all
the input records but also parsing them as Pig tuples.
This is because the SampleLoaders are wrappers around the actual input
loaders specified in the
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankit Modi updated PIG-1036:
Attachment: LeftOuterFRJoin.patch
Attaching a new patch.
The join now supports only a two-way left join.
[
https://issues.apache.org/jira/browse/PIG-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1277#action_1277
]
Alan Gates commented on PIG-1048:
-
When attempting to apply this patch to the 0.5 branch, I
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773103#action_12773103
]
Alan Gates commented on PIG-970:
When I run TestHBaseStorage now I get:
Testcase:
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Gates updated PIG-970:
---
Attachment: test-output.tgz
TEST-org.apache.pig.test.TestHBaseStorage.txt
Test run results plus
We definitely want to avoid parsing every tuple when sampling. But do
we need to implement a special function for it? Pig will have access
to the InputFormat instance, correct? Can it not call
InputFormat.getNext the desired number of times (which will not parse
the tuple) and then call
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773348#action_12773348
]
Alan Gates commented on PIG-970:
afterside:~/src/pig/PIG-970-3/trunk jar tf
[
https://issues.apache.org/jira/browse/PIG-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich resolved PIG-1002.
-
Resolution: Fixed
this has been addressed in other JIRAs
FINDBUGS: BC: Equals method should not
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pradeep Kamath updated PIG-1036:
Resolution: Fixed
Fix Version/s: 0.6.0
Hadoop Flags: [Reviewed]
Status:
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773309#action_12773309
]
Jeff Zhang commented on PIG-970:
Alan, do you have the file hbase-site.xml in the test folder? (I
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankit Modi updated PIG-1036:
Attachment: (was: LeftOuterFRJoin.patch)
Fragment-replicate left outer join
[
https://issues.apache.org/jira/browse/PIG-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773295#action_12773295
]
Pradeep Kamath commented on PIG-966:
I have updated
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankit Modi updated PIG-1036:
Status: Open (was: Patch Available)
Fragment-replicate left outer join
--
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773297#action_12773297
]
Jeff Zhang commented on PIG-970:
Yes, Alan, could you attach the whole log including the logs
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Zhang updated PIG-970:
---
Attachment: (was: Pig_HBase_0.20.0.patch)
Support of HBase 0.20.0
---
[
https://issues.apache.org/jira/browse/PIG-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich updated PIG-1058:
Resolution: Fixed
Status: Resolved (was: Patch Available)
patch committed. Thanks, Pradeep,
[
https://issues.apache.org/jira/browse/PIG-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773181#action_12773181
]
Pradeep Kamath commented on PIG-958:
bq. 2. Deleting the temporary directory manually in
The twoLevelAccessRequired flag is not really a long-term solution to the
problem. The problem is that we treat the output of relations as bags, but their
schemas do NOT have twoLevelAccessRequired set to true. Only bag constants and
bags from input data have this flag set to true. We need to move
[
https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773192#action_12773192
]
Alan Gates commented on PIG-997:
After applying this patch TestColumnSecurity fails. The
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankit Modi updated PIG-1036:
Status: Patch Available (was: Open)
Fragment-replicate left outer join
--
Pig Team is happy to announce Pig 0.5.0 release!
Pig is a Hadoop subproject that provides high-level data-flow language
and an execution framework for parallel computation on a Hadoop cluster.
More details about Pig can be found at http://hadoop.apache.org/pig/.
This release makes
[
https://issues.apache.org/jira/browse/PIG-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Richard Ding reassigned PIG-1071:
-
Assignee: Richard Ding
Support comma separated file/directory names in load statements
[
https://issues.apache.org/jira/browse/PIG-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773184#action_12773184
]
Pradeep Kamath commented on PIG-958:
I saw compile errors while trying to run unit test:
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773314#action_12773314
]
Alan Gates commented on PIG-970:
Yes, it's there.
Support of HBase 0.20.0
[
https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yan Zhou updated PIG-997:
-
Status: Open (was: Patch Available)
The failure is due to a misplaced test in the nightly suite. I'm going to
[
https://issues.apache.org/jira/browse/PIG-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich updated PIG-1058:
Status: Patch Available (was: Open)
FINDBUGS: remaining Correctness Warnings
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Zhang updated PIG-970:
---
Attachment: Pig_HBase_0.20.0.patch
Alan, I found the problem. Previously, in Eclipse, I had put the output folder to
Support comma separated file/directory names in load statements
---
Key: PIG-1071
URL: https://issues.apache.org/jira/browse/PIG-1071
Project: Pig
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773273#action_12773273
]
Hadoop QA commented on PIG-1036:
+1 overall. Here are the results of testing the latest
[
https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yan Zhou updated PIG-997:
-
Attachment: SortedTable.patch
[zebra] Sorted Table Support by Zebra
-
[
https://issues.apache.org/jira/browse/PIG-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich updated PIG-1058:
Status: Open (was: Patch Available)
FINDBUGS: remaining Correctness Warnings
[
https://issues.apache.org/jira/browse/PIG-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773229#action_12773229
]
Pradeep Kamath commented on PIG-1036:
-
+1, will commit once hudson QA comes back.
Thanks Pradeep,
I saw that comment. I guess my question is, given the solution this
comment describes, what are you referring to in the Load/Store
redesign doc when you say we must fix the two level access issues
with schema of bags in current schema before we make these changes,
otherwise that
From comments in Schema.java:
// In bags which have a schema with a tuple which contains
// the fields present in it, if we access the second field (say)
// we are actually trying to access the second field in the
// tuple in the bag. This is currently true for two cases:
// 1)
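The access rule described in that comment can be sketched as follows. This is only an illustration under stated assumptions: Node and getField are hypothetical, simplified stand-ins and not the actual classes or methods in Pig's Schema.java; only the flag name twoLevelAccessRequired comes from the discussion.

```java
import java.util.Arrays;
import java.util.List;

public class TwoLevelAccessSketch {
    /** Hypothetical, simplified schema node; not Pig's Schema class. */
    public static class Node {
        public final String name;
        public final boolean twoLevelAccessRequired;
        public final List<Node> children;

        public Node(String name, boolean twoLevelAccessRequired, Node... children) {
            this.name = name;
            this.twoLevelAccessRequired = twoLevelAccessRequired;
            this.children = Arrays.asList(children);
        }

        /** Resolves a positional field, looking through the bag's tuple when the flag is set. */
        public Node getField(int index) {
            if (twoLevelAccessRequired) {
                // The bag schema holds a single tuple; the index addresses a field of that tuple.
                return children.get(0).children.get(index);
            }
            return children.get(index);
        }
    }

    public static void main(String[] args) {
        Node tuple = new Node("t", false, new Node("x", false), new Node("y", false));
        Node bag = new Node("b", true, tuple);
        // Asking the bag for its second field yields the second field of the inner tuple.
        System.out.println(bag.getField(1).name);
    }
}
```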
[
https://issues.apache.org/jira/browse/PIG-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773339#action_12773339
]
Jeff Zhang commented on PIG-970:
Well, it's weird.
Alan, could you check again that the
[
https://issues.apache.org/jira/browse/PIG-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Olga Natkovich updated PIG-1058:
Attachment: PIG-1058_v2.patch
Addressed unit test failures
FINDBUGS: remaining Correctness
[
https://issues.apache.org/jira/browse/PIG-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773389#action_12773389
]
Ankur commented on PIG-958:
---
Can you explain this a little bit more - ..
In the earlier patch
Hi pig team,
I'm testing zebra v2 and trying to run the pig 0.60 jar that I got from Yan.
However, I got the following error:
Caused by: java.lang.ClassNotFoundException: jline.ConsoleReaderInputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at
[
https://issues.apache.org/jira/browse/PIG-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yan Zhou updated PIG-997:
-
Status: Patch Available (was: Open)
[zebra] Sorted Table Support by Zebra
-