[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263690#comment-16263690
]
Hyukjin Kwon commented on SPARK-22240:
--
Sure, sounds good!
> S3 CSV number of partitions
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16255861#comment-16255861
]
Steve Loughran commented on SPARK-22240:
I think there are two separate issues and I'm adding
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223564#comment-16223564
]
Hyukjin Kwon commented on SPARK-22240:
--
I am sorry, I have been super busy these past few days.
So,
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223400#comment-16223400
]
Steve Loughran commented on SPARK-22240:
so this partition calculation problem is independent of
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217398#comment-16217398
]
Steve Loughran commented on SPARK-22240:
no, spark 2.2 doesn't fix this. I have to explicitly
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217325#comment-16217325
]
Steve Loughran commented on SPARK-22240:
I'm doing some testing with master & reading files off
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204019#comment-16204019
]
Hyukjin Kwon commented on SPARK-22240:
--
Here, I could reproduce the issue in the JIRA by the steps
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203966#comment-16203966
]
Steve Loughran commented on SPARK-22240:
Point me at a simple test suite for the multiline & I'll
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203771#comment-16203771
]
Hyukjin Kwon commented on SPARK-22240:
--
Yes, this problem exists when {{multiLine}} is enabled. This is
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203709#comment-16203709
]
Steve Loughran commented on SPARK-22240:
[~hyukjin.kwon]: we now see that on s3a, you only ever
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203600#comment-16203600
]
Hyukjin Kwon commented on SPARK-22240:
--
I am sorry for the late response.
{quote}
If it's multiline
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203495#comment-16203495
]
Steve Loughran commented on SPARK-22240:
We've got a test in HADOOP-14943 which looks @ part
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202691#comment-16202691
]
Arthur Baudry commented on SPARK-22240:
---
[~hyukjin.kwon] Yes it is a single file so even counting
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201863#comment-16201863
]
Steve Loughran commented on SPARK-22240:
thanks. Now for a question which is probably obvious to
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201407#comment-16201407
]
Hyukjin Kwon commented on SPARK-22240:
--
There were multiple JIRAs for this feature. I believe
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200485#comment-16200485
]
Steve Loughran commented on SPARK-22240:
What's the link to the multiline JIRA? As that could
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200413#comment-16200413
]
Hyukjin Kwon commented on SPARK-22240:
--
Hm, but this particular issue looks more related to when
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200367#comment-16200367
]
Steve Loughran commented on SPARK-22240:
Amazon EMR is Amazon's own fork of Spark & Hadoop, with
[
https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200333#comment-16200333
]
Hyukjin Kwon commented on SPARK-22240:
--
This is an unfortunate limitation when {{multiLine}} is