Merge changes from trunk to branch-0.1 for Pig 0.1.1 release
Key: PIG-541
URL: https://issues.apache.org/jira/browse/PIG-541
Project: Pig
Issue Type: Task
Reporter: Olg
Hi,
I have a script roughly analogous to this:
users = LOAD '/users.tsv' AS (id);
sessions = LOAD '/sessions.tsv' AS (id, userid, duration, day);
user_sessions = JOIN users BY id INNER, sessions BY userid INNER;
intermediate_aggregate = FOREACH (GROUP user_sessions BY (userid, day))
{
Hello,
The solution to your problem lies in storing intermediate_aggregate to a
file, and then reloading it.
i.e.
intermediate_aggregate = FOREACH (GROUP user_sessions BY (userid, day))
{
-- Code omitted
}
-- SNIP
store intermediate_aggregate into '/intermediate.tsv';
intermediate_agg
[ https://issues.apache.org/jira/browse/PIG-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-541:
---
Attachment: hadoop18.jar
> Merge changes from trunk to branch-0.1 for Pig 0.1.1 release
> --
Olga Natkovich updated PIG-541:
---
Attachment: trunk_2_branch.patch
Olga Natkovich updated PIG-541:
---
Status: Patch Available (was: Open)
I have just added the patch that brings branch-0.1 to the same code
Pig gets confused about the schema when joining a table that has a known schema
with one that doesn't
--
Key: PIG-542
URL: https://issues.apache.org/jira/browse/PIG-542