[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357535#comment-15357535 ] Nicholas Chammas commented on SPARK-: {quote} Python itself has no compile time type safety.

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356650#comment-15356650 ] Reynold Xin commented on SPARK-: Unfortunately I think that's still necessary. The problem is that

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356647#comment-15356647 ] Maciej Bryński commented on SPARK-: OK. So what about this patch?

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356644#comment-15356644 ] Reynold Xin commented on SPARK-: After thinking about that more, I don't think it will happen any

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2016-06-30 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356633#comment-15356633 ] Maciej Bryński commented on SPARK-: [~rxin] What about the Python API? What's the target version?

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022634#comment-15022634 ] Sandy Ryza commented on SPARK-: [~nchammas] it's not clear that it makes sense to add a similar API

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022883#comment-15022883 ] Xiao Li commented on SPARK-: Agree. The major performance gain of Dataset should be from Catalyst

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023242#comment-15023242 ] Xiao Li commented on SPARK-: Thank you! I think the users might need to understand the potential

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023243#comment-15023243 ] Xiao Li commented on SPARK-: Thank you! I think the users might need to understand the potential

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023241#comment-15023241 ] Xiao Li commented on SPARK-: Thank you! I think the users might need to understand the potential

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022735#comment-15022735 ] Nicholas Chammas commented on SPARK-: [~sandyr] - Hmm, so are you saying that, generally

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022765#comment-15022765 ] Maciej Bryński commented on SPARK-: I think we check types also in Python. As I understand

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022957#comment-15022957 ] Nicholas Chammas commented on SPARK-: If you are referring to my comment, note that I am

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023139#comment-15023139 ] Reynold Xin commented on SPARK-: [~nchammas] Dataset actually will be slightly slower than

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023144#comment-15023144 ] Xiao Li commented on SPARK-: Will you publish the exact performance penalty? Is it obvious? For

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023147#comment-15023147 ] Reynold Xin commented on SPARK-: We haven't measured it yet, and as you said it is highly workload

[jira] [Commented] (SPARK-9999) Dataset API on top of Catalyst/DataFrame

2015-11-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15019214#comment-15019214 ] Nicholas Chammas commented on SPARK-: - Arriving a little late to this discussion. Quick