[jira] [Comment Edited] (SPARK-12179) Spark SQL get different result with the same code
[ https://issues.apache.org/jira/browse/SPARK-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670679#comment-15670679 ]

Herman van Hovell edited comment on SPARK-12179 at 11/16/16 3:19 PM:
[~litao1990] is this still a problem? Let me know; I am closing for now due to lack of activity.

was (Author: hvanhovell):
[~litao1990] is this still a problem?

> Spark SQL get different result with the same code
> -
>
> Key: SPARK-12179
> URL: https://issues.apache.org/jira/browse/SPARK-12179
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, SQL
> Affects Versions: 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.4.1, 1.4.2, 1.5.0, 1.5.1, 1.5.2, 1.5.3
> Environment: hadoop version: 2.5.0-cdh5.3.2
> spark version: 1.5.3
> run mode: yarn-client
> Reporter: Tao Li
> Priority: Critical
>
> I run the SQL in yarn-client mode, but get a different result each time.
> As the example shows, I get a different shuffle write for the same shuffle read in two jobs running the same code.
> Some of my Spark apps run well, but some always hit this problem, and I have seen it on Spark 1.3, 1.4, and 1.5.
> Can you give me some suggestions about the possible causes, or about how I can figure out the problem?
> 1. First Run
> Details for Stage 9 (Attempt 0)
> Total Time Across All Tasks: 5.8 min
> Shuffle Read: 24.4 MB / 205399
> Shuffle Write: 6.8 MB / 54934
> 2. Second Run
> Details for Stage 9 (Attempt 0)
> Total Time Across All Tasks: 5.6 min
> Shuffle Read: 24.4 MB / 205399
> Shuffle Write: 6.8 MB / 54905

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-12179) Spark SQL get different result with the same code
[ https://issues.apache.org/jira/browse/SPARK-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15060217#comment-15060217 ]

Davies Liu edited comment on SPARK-12179 at 12/16/15 4:04 PM:
Which version of Spark are you using? Can you try the latest 1.5 branch or the 1.6 RC?

was (Author: davies):
Which version of Spark are you using?
[jira] [Comment Edited] (SPARK-12179) Spark SQL get different result with the same code
[ https://issues.apache.org/jira/browse/SPARK-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15047193#comment-15047193 ]

Davies Liu edited comment on SPARK-12179 at 12/8/15 6:24 PM:
There are two directions in which to narrow down the problem:
1) simplify the query until removing anything more from it makes the problem go away;
2) remove the customized configurations (for example, extraJavaOptions) one by one until removing one of them makes the problem go away.
This could be a critical bug; hopefully we can find a way to fix it.

was (Author: davies):
There are two direction to narrow down the problem: 1) simplify the query until removing anything from it the problem will gone 2) remove the customized configurations (for example, extraJavaOptions), until remove anything of them the problem will gone. This could be a critical bug, hopefully we could find a way to fix it.
[jira] [Comment Edited] (SPARK-12179) Spark SQL get different result with the same code
[ https://issues.apache.org/jira/browse/SPARK-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15047965#comment-15047965 ]

Tao Li edited comment on SPARK-12179 at 12/9/15 6:56 AM:
The row_number implementation is as follows:

package UDF;

import java.io.PrintStream;
import org.apache.hadoop.hive.ql.exec.UDF;

public class row_number extends UDF {
    // NOTE: this state is static, so it is shared by every task running
    // in the same executor JVM.
    private static int MAX_VALUE = 50;
    private static String[] comparedColumn = new String[MAX_VALUE];
    private static int rowNum = 1;

    public int evaluate(Object[] args) {
        String[] columnValue = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            columnValue[i] = (args[i] == null ? "" : args[i].toString());
        }
        if (rowNum == 1) {
            for (int i = 0; i < columnValue.length; i++) {
                comparedColumn[i] = columnValue[i];
            }
        }
        for (int i = 0; i < columnValue.length; i++) {
            if (!comparedColumn[i].equals(columnValue[i])) {
                for (int j = 0; j < columnValue.length; j++) {
                    comparedColumn[j] = columnValue[j];
                }
                rowNum = 1;
                return rowNum++;
            }
        }
        return rowNum++;
    }
}

was (Author: litao1990):
The row_number implementation is as follows:

package UDF;

import org.apache.hadoop.hive.ql.exec.UDF;

public class RowNumber extends UDF {
    private static int MAX_VALUE = 50;
    private static String[] comparedColumn = new String[MAX_VALUE];
    private static int rowNum = 1;

    public int evaluate(Object[] args) {
        String[] columnValue = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            columnValue[i] = (args[i] == null ? "" : args[i].toString());
        }
        if (rowNum == 1) {
            for (int i = 0; i < columnValue.length; i++) {
                comparedColumn[i] = columnValue[i];
            }
        }
        for (int i = 0; i < columnValue.length; i++) {
            if (!comparedColumn[i].equals(columnValue[i])) {
                for (int j = 0; j < columnValue.length; j++) {
                    comparedColumn[j] = columnValue[j];
                }
                rowNum = 1;
                return rowNum++;
            }
        }
        return rowNum++;
    }
}
[jira] [Comment Edited] (SPARK-12179) Spark SQL get different result with the same code
[ https://issues.apache.org/jira/browse/SPARK-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045053#comment-15045053 ]

Tao Li edited comment on SPARK-12179 at 12/7/15 3:21 PM:
The query is on a Hive table and the Hive data is not changing. I think there are many factors that could cause this problem, such as:
1. Is there some difference between the environments of the different Hadoop nodes?
2. Are there bugs in Spark shuffle?
3. Is there a classpath or jar version problem?
4. Is it a Hive compatibility problem?
I think I can make some breakthrough on the "shuffle write" number displayed on the web UI. Why is the shuffle write different? How is the shuffle write number computed? What factors could cause the shuffle write to differ? I will work on this case and figure it out. [~srowen] If you have any ideas or experience, please let me know. Thank you very much!

was (Author: litao1990):
The query is on a hive table and the hive data is not changing. I think there are many factor will cause this problem, such as 1. is there some different in different hadoop node environment ? 2. is there some bugs on spark shuffle ? 3. is there some classpath or jar version problem ? 4. is the hive compatibility problem ? I think I can make some breakthrough on "shuffle write" number display on the web ui. Why the shuffle write is different? How to get the shuffle write number? Is there any factor will cause the shuffle write different? I will work on this cause and figure it out. [~srowen] If you have any idea or experience, please let me know. Thank you very much!
[jira] [Comment Edited] (SPARK-12179) Spark SQL get different result with the same code
[ https://issues.apache.org/jira/browse/SPARK-12179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046243#comment-15046243 ]

Tao Li edited comment on SPARK-12179 at 12/8/15 3:00 AM:
I see there were some exceptions in my executors' stderr log:

15/12/08 00:47:44 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 24
15/12/08 00:47:44 INFO storage.MemoryStore: ensureFreeSpace(45407) called with curMem=417720, maxMem=2893440614
15/12/08 00:47:44 INFO storage.MemoryStore: Block broadcast_24_piece0 stored as bytes in memory (estimated size 44.3 KB, free 2.7 GB)
15/12/08 00:47:44 INFO broadcast.TorrentBroadcast: Reading broadcast variable 24 took 34 ms
15/12/08 00:47:44 INFO storage.MemoryStore: ensureFreeSpace(527088) called with curMem=463127, maxMem=2893440614
15/12/08 00:47:44 INFO storage.MemoryStore: Block broadcast_24 stored as values in memory (estimated size 514.7 KB, free 2.7 GB)
15/12/08 00:47:44 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@661ec4ee:an attempt to override final parameter: mapreduce.reduce.speculative; Ignoring.
15/12/08 00:47:45 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/12/08 00:47:45 INFO metastore.ObjectStore: ObjectStore, initialize called
15/12/08 00:47:45 WARN metastore.HiveMetaStore: Retrying creating default database after error: Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
javax.jdo.JDOFatalUserException: Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
	at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1175)
	at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
	at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
	at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
	at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
	at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
	at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
	at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
	at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
	at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:166)
	at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobPropertiesForStorageHandler(PlanUtils.java:803)
	at
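Editorial note on the JDOFatalUserException above: "Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found" generally means the DataNucleus jars are missing from the classpath of the JVM that tried to open the Hive metastore. It is not confirmed that this is related to the differing results, but one conventional remedy on Spark 1.x with YARN is to ship those jars (and hive-site.xml) explicitly. The paths and jar versions below are illustrative only, taken from a typical Spark 1.5 distribution built with Hive support:

```shell
# Configuration sketch (illustrative paths): ship the DataNucleus jars and
# the Hive configuration with the application so executors can load them.
SPARK_LIB=/path/to/spark/lib
spark-submit \
  --master yarn-client \
  --jars $SPARK_LIB/datanucleus-api-jdo-3.2.6.jar,$SPARK_LIB/datanucleus-core-3.2.10.jar,$SPARK_LIB/datanucleus-rdbms-3.2.9.jar \
  --files /path/to/hive-site.xml \
  your-app.jar
```

If the jars are present but a different version is picked up from the Hadoop distribution's classpath, that would also fit the jar-version hypothesis raised earlier in the thread.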