[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688696#comment-15688696
]
Xianda Ke commented on PIG-5029:
Hi [~kellyzly],
Salted key solution seems OK. JDK's Random is a
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624556#comment-15624556
]
liyunzhang_intel commented on PIG-5029:
---
[~knoguchi]:
Earlier we discussed a lot about the skewed
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530152#comment-15530152
]
Koji Noguchi commented on PIG-5029:
---
{quote}
bq. how is pig handling skew for MR/TEZ?
Sampling is done and
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530114#comment-15530114
]
Rohini Palaniswamy commented on PIG-5029:
-
bq. how is pig handling skew for MR/TEZ?
Sampling is
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530104#comment-15530104
]
Koji Noguchi commented on PIG-5029:
---
Thanks a million [~tgraves].
> Optimize sort case when data is
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530065#comment-15530065
]
Thomas Graves commented on PIG-5029:
Is the question whether spark supports maps that aren't idempotent?
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528532#comment-15528532
]
liyunzhang_intel commented on PIG-5029:
---
[~knoguchi]: thanks for your patience on this jira. I will enable
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526775#comment-15526775
]
Koji Noguchi commented on PIG-5029:
---
[~kellyzly], I believe I explained how pure RANDOM key would break
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524845#comment-15524845
]
liyunzhang_intel commented on PIG-5029:
---
[~vanzin]: I guess what [~rohini] and [~knoguchi] worry about is
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524832#comment-15524832
]
Marcelo Vanzin commented on PIG-5029:
-
I'm not sure I understand the question. If you're worried about
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524780#comment-15524780
]
liyunzhang_intel commented on PIG-5029:
---
[~rohini] and [~knoguchi]: thanks for your patience on this
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521883#comment-15521883
]
liyunzhang_intel commented on PIG-5029:
---
[~vanzin]: Thanks for your comment, here I have a question
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15516915#comment-15516915
]
Marcelo Vanzin commented on PIG-5029:
-
Not really my area of expertise; but this reminds me of Ted
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515605#comment-15515605
]
liyunzhang_intel commented on PIG-5029:
---
[~kexianda], [~mohitsabharwal] , [~pallavi.rao] and
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505624#comment-15505624
]
Xianda Ke commented on PIG-5029:
[~kellyzly],
when a task re-runs, the partitioning is not the same as the first
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505394#comment-15505394
]
liyunzhang_intel commented on PIG-5029:
---
[~knoguchi]:
{quote}
If node goes down after
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505381#comment-15505381
]
Koji Noguchi commented on PIG-5029:
---
bq. Hadoop will try a new task attempt only when the last task attempt fails
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15503610#comment-15503610
]
Koji Noguchi commented on PIG-5029:
---
[~kellyzly], this has nothing to do with speculative execution.
>
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500402#comment-15500402
]
liyunzhang_intel commented on PIG-5029:
---
[~knoguchi]: Thanks for your reply.
Here is a question about
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490809#comment-15490809
]
Rohini Palaniswamy commented on PIG-5029:
-
bq. Although spark will sample the data automatically
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490595#comment-15490595
]
Koji Noguchi commented on PIG-5029:
---
bq. Can you explain more about why this causes data loss and
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489116#comment-15489116
]
liyunzhang_intel commented on PIG-5029:
---
[~rohini] and [~knoguchi]: Thanks for your interest in this
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488515#comment-15488515
]
Rohini Palaniswamy commented on PIG-5029:
-
[~knoguchi] just pointed out this jira to me as there is
[
https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486676#comment-15486676
]
liyunzhang_intel commented on PIG-5029:
---
The solution for the skewed data sort in PIG-5029.patch