[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336923#comment-15336923
]
Daniel Templeton commented on MAPREDUCE-6712:
-
PySpark actually does some cleverness under
[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335164#comment-15335164
]
He Tianyi commented on MAPREDUCE-6712:
--
Oh, just another thought: maybe scenarios like this can
[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326214#comment-15326214
]
He Tianyi commented on MAPREDUCE-6712:
--
Hi, [~templedf]. Thanks for these ideas.
I think the null
[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324821#comment-15324821
]
Daniel Templeton commented on MAPREDUCE-6712:
-
[~He Tianyi], after chewing on this a bit
[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323690#comment-15323690
]
Daniel Templeton commented on MAPREDUCE-6712:
-
For C++ apps, there's Hadoop Pipes, which
[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321658#comment-15321658
]
He Tianyi commented on MAPREDUCE-6712:
--
Actually, in my experiments (in-house workload) turning
[
https://issues.apache.org/jira/browse/MAPREDUCE-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320692#comment-15320692
]
Daniel Templeton commented on MAPREDUCE-6712:
-
Hadoop Streaming is limited by the fact