[
https://issues.apache.org/jira/browse/SPARK-46981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837039#comment-17837039
]
Jarred Li commented on SPARK-46981:
---
I used the default driver memory setting (1 GB), and an OOM error was thrown. It
[
https://issues.apache.org/jira/browse/SPARK-42069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarred Li updated SPARK-42069:
--
Description:
When writing a table with shuffle data and a non-deterministic function, data may be
Jarred Li created SPARK-42069:
-
Summary: Data duplicate or data lost with non-deterministic
function
Key: SPARK-42069
URL: https://issues.apache.org/jira/browse/SPARK-42069
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178445#comment-17178445
]
Jarred Li commented on SPARK-32582:
---
??I am not sure it would be helpful since there is no API in
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175428#comment-17175428
]
Jarred Li edited comment on SPARK-32582 at 8/11/20, 10:07 AM:
--
I think this
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175428#comment-17175428
]
Jarred Li commented on SPARK-32582:
---
I think this is a limitation of ORC schema inference.
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175165#comment-17175165
]
Jarred Li edited comment on SPARK-32582 at 8/11/20, 7:06 AM:
-
The
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175165#comment-17175165
]
Jarred Li edited comment on SPARK-32582 at 8/11/20, 7:05 AM:
-
The
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175165#comment-17175165
]
Jarred Li edited comment on SPARK-32582 at 8/11/20, 6:59 AM:
-
The
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarred Li updated SPARK-32582:
--
Description:
When schema inference is enabled, it tries to list all the files in the table,
however only
[
https://issues.apache.org/jira/browse/SPARK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175165#comment-17175165
]
Jarred Li commented on SPARK-32582:
---
The performance I mentioned here is not the read file, but "LIST"
Jarred Li created SPARK-32582:
-
Summary: Spark SQL Infer Schema Performance
Key: SPARK-32582
URL: https://issues.apache.org/jira/browse/SPARK-32582
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarred Li updated SPARK-4164:
-
Remaining Estimate: 2h
Original Estimate: 2h
spark.kryo.registrator shall use comma separated value
[
https://issues.apache.org/jira/browse/SPARK-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191475#comment-14191475
]
Jarred Li commented on SPARK-4164:
--
I can work on this issue. Could somebody assign this
[
https://issues.apache.org/jira/browse/SPARK-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarred Li resolved SPARK-3980.
--
Resolution: Not a Problem
Resolved the issue by running the job on a bigger cluster.
GraphX Performance
Jarred Li created SPARK-3980:
Summary: GraphX Performance Issue
Key: SPARK-3980
URL: https://issues.apache.org/jira/browse/SPARK-3980
Project: Spark
Issue Type: Bug
Components: GraphX
[
https://issues.apache.org/jira/browse/SPARK-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarred Li updated SPARK-3980:
-
Description: I run 4 workers in AWS (c3.xlarge), 4g memory for executor,
85,331,846 edges
[
https://issues.apache.org/jira/browse/SPARK-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarred Li updated SPARK-3980:
-
Description:
I run 4 workers in AWS (c3.xlarge), 4g memory for executor, 85,331,846 edges
18 matches