[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-11-22 Thread Xianda Ke (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688696#comment-15688696 ] Xianda Ke commented on PIG-5029: Hi [~kellyzly], the salted key solution seems OK. JDK's Random is a

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-11-01 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624556#comment-15624556 ] liyunzhang_intel commented on PIG-5029: [~knoguchi]: Earlier we discussed a lot about the skewed

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-28 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530152#comment-15530152 ] Koji Noguchi commented on PIG-5029: {quote} bq. how is pig handling skew for MR/TEZ? Sampling is done and

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-28 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530114#comment-15530114 ] Rohini Palaniswamy commented on PIG-5029: bq. how is pig handling skew for MR/TEZ? Sampling is

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-28 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530104#comment-15530104 ] Koji Noguchi commented on PIG-5029: Thanks a million [~tgraves]. > Optimize sort case when data is

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-28 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530065#comment-15530065 ] Thomas Graves commented on PIG-5029: Is the question whether Spark supports maps that aren't idempotent?

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-28 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528532#comment-15528532 ] liyunzhang_intel commented on PIG-5029: [~knoguchi]: Thanks for your patience on this JIRA. I will enable

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-27 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526775#comment-15526775 ] Koji Noguchi commented on PIG-5029: [~kellyzly], I believe I explained how a pure RANDOM key would break

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-26 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524845#comment-15524845 ] liyunzhang_intel commented on PIG-5029: [~vanzin]: I guess what [~rohini] and [~knoguchi] worry about is

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524832#comment-15524832 ] Marcelo Vanzin commented on PIG-5029: I'm not sure I understand the question. If you're worried about

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-26 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524780#comment-15524780 ] liyunzhang_intel commented on PIG-5029: [~rohini] and [~knoguchi]: Thanks for your patience on this

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-25 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521883#comment-15521883 ] liyunzhang_intel commented on PIG-5029: [~vanzin]: Thanks for your comment. Here I have a question

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-23 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15516915#comment-15516915 ] Marcelo Vanzin commented on PIG-5029: Not really my area of expertise, but this reminds me of Ted

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-23 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515605#comment-15515605 ] liyunzhang_intel commented on PIG-5029: [~kexianda], [~mohitsabharwal], [~pallavi.rao] and

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-19 Thread Xianda Ke (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505624#comment-15505624 ] Xianda Ke commented on PIG-5029: [~kellyzly], when a task is re-run, the partitioning is not the same as the first

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-19 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505394#comment-15505394 ] liyunzhang_intel commented on PIG-5029: [~knoguchi]: {quote} If node goes down after

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-19 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505381#comment-15505381 ] Koji Noguchi commented on PIG-5029: bq. Hadoop will try new task attempt only when last task attempt fail

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-19 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15503610#comment-15503610 ] Koji Noguchi commented on PIG-5029: [~kellyzly], this has nothing to do with speculative execution. >

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-18 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500402#comment-15500402 ] liyunzhang_intel commented on PIG-5029: [~knoguchi]: Thanks for your reply. Here is a question about

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-14 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490809#comment-15490809 ] Rohini Palaniswamy commented on PIG-5029: bq. Although Spark will sample the data automatically

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-14 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490595#comment-15490595 ] Koji Noguchi commented on PIG-5029: bq. Can you explain more about why this causes data loss and

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-13 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489116#comment-15489116 ] liyunzhang_intel commented on PIG-5029: [~rohini] and [~knoguchi]: Thanks for your interest in this

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-13 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488515#comment-15488515 ] Rohini Palaniswamy commented on PIG-5029: [~knoguchi] just pointed this JIRA out to me as there is

[jira] [Commented] (PIG-5029) Optimize sort case when data is skewed

2016-09-13 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/PIG-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486676#comment-15486676 ] liyunzhang_intel commented on PIG-5029: The solution for the skewed data sort in PIG-5029.patch