[jira] [Commented] (SPARK-18976) In standalone mode, an executor expired by HeartbeatReceiver still takes up cores but has no tasks assigned to it
[ https://issues.apache.org/jira/browse/SPARK-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777821#comment-15777821 ] liujianhui commented on SPARK-18976: Thanks for your attention. I found the root cause; it is the same as in https://issues.apache.org/jira/browse/SPARK-18994. The master found that the worker's heartbeat had expired and removed the worker, but the executor on that worker stayed alive. When the standby master became active, this executor was reported to the new master along with the WorkerSchedulerStateResponse, and the executor was added back to the corresponding app's executor list. > In standalone mode, an executor expired by HeartbeatReceiver still takes up > cores but has no tasks assigned to it > -- > > Key: SPARK-18976 > URL: https://issues.apache.org/jira/browse/SPARK-18976 > Project: Spark > Issue Type: Bug > Components: Deploy >Affects Versions: 1.6.1 > Environment: jdk1.8.0_77 Red Hat 4.4.7-11 >Reporter: liujianhui > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > h2. Scene > When an executor is expired by HeartbeatReceiver in the driver, the driver marks that > executor as not alive and the task scheduler no longer assigns tasks to it, > but the executor's status stays "running" and it keeps taking up cores. Here, > executor 18 was expired and had no tasks running, with far less task time than the > normal executor 142, yet the app page still shows it as running. > !screenshot-1.png! > !screenshot-2.png! > !screenshot-3.png! 
> h2. Process: > # The executor is expired by HeartbeatReceiver because its last heartbeat exceeded the > executor timeout. > # The executor is removed in CoarseGrainedSchedulerBackend.killExecutors and > marked as dead; from then on it is excluded from resource offers > because it is in executorsPendingToRemove. > # The status of that executor remains "running" because the CoarseGrainedExecutorBackend > process still exists and re-registers its block manager with the driver every > 10s; log:
> {code}
> 16/12/22 17:04:26 INFO Executor: Told to re-register on heartbeat
> 16/12/22 17:04:26 INFO BlockManager: BlockManager re-registering with master
> 16/12/22 17:04:26 INFO BlockManagerMaster: Trying to register BlockManager
> 16/12/22 17:04:26 INFO BlockManagerMaster: Registered BlockManager
> 16/12/22 17:04:26 INFO BlockManager: Reporting 0 blocks to the master.
> 16/12/22 17:04:36 INFO Executor: Told to re-register on heartbeat
> 16/12/22 17:04:36 INFO BlockManager: BlockManager re-registering with master
> 16/12/22 17:04:36 INFO BlockManagerMaster: Trying to register BlockManager
> 16/12/22 17:04:36 INFO BlockManagerMaster: Registered BlockManager
> 16/12/22 17:04:36 INFO BlockManager: Reporting 0 blocks to the master.
> 16/12/22 17:04:46 INFO Executor: Told to re-register on heartbeat
> 16/12/22 17:04:46 INFO BlockManager: BlockManager re-registering with master
> 16/12/22 17:04:46 INFO BlockManagerMaster: Trying to register BlockManager
> 16/12/22 17:04:46 INFO BlockManagerMaster: Registered BlockManager
> 16/12/22 17:04:46 INFO BlockManager: Reporting 0 blocks to the master.
> 16/12/22 17:04:56 INFO Executor: Told to re-register on heartbeat
> 16/12/22 17:04:56 INFO BlockManager: BlockManager re-registering with master
> 16/12/22 17:04:56 INFO BlockManagerMaster: Trying to register BlockManager
> 16/12/22 17:04:56 INFO BlockManagerMaster: Registered BlockManager
> 16/12/22 17:04:56 INFO BlockManager: Reporting 0 blocks to the master.
> {code}
> h2. 
Resolution > When the number of re-register attempts exceeds some threshold (e.g. 10), the > executor should exit with code zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
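The proposed resolution (exiting after repeated re-register attempts) could be sketched roughly as follows. This is a hedged illustration only; the method and counter names are assumptions, not actual Spark source:

```scala
// Illustrative sketch of the proposed fix: count consecutive
// "Told to re-register" heartbeat responses and exit once a threshold is
// exceeded, so an executor the driver no longer tracks frees its cores.
private var reregisterAttempts = 0
private val maxReregisterAttempts = 10 // assumed threshold, per the report

def onHeartbeatResponse(reregisterBlockManager: Boolean): Unit = {
  if (reregisterBlockManager) {
    reregisterAttempts += 1
    if (reregisterAttempts > maxReregisterAttempts) {
      // The driver has expired this executor; exit cleanly (code 0)
      // instead of holding cores forever.
      System.exit(0)
    }
  } else {
    reregisterAttempts = 0 // a healthy heartbeat resets the counter
  }
}
```

A counter reset on a healthy heartbeat avoids killing executors that recover after a transient driver-side expiry.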
[jira] [Commented] (SPARK-18997) Recommended upgrade libthrift to 0.9.3
[ https://issues.apache.org/jira/browse/SPARK-18997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777815#comment-15777815 ] Liang-Chi Hsieh commented on SPARK-18997: - I've checked the dependencies and it seems there is no conflict. {code}
+-org.apache.thrift:libfb303:0.9.3
| +-org.apache.thrift:libthrift:0.9.3
| +-org.apache.httpcomponents:httpclient:4.4.1 (evicted by: 4.5.2)
| +-org.apache.httpcomponents:httpclient:4.5.2
| | +-commons-codec:commons-codec:1.10
| | +-commons-codec:commons-codec:1.9 (evicted by: 1.10)
| | +-commons-logging:commons-logging:1.2
| | +-org.apache.httpcomponents:httpcore:4.4.4
| | | +-org.apache.httpcomponents:httpcore:4.4.1 (evicted by: 4.4.4)
| +-org.apache.httpcomponents:httpcore:4.4.4
| +-org.apache.thrift:libthrift:0.9.3
| +-org.apache.httpcomponents:httpclient:4.4.1 (evicted by: 4.5.2)
| +-org.apache.httpcomponents:httpclient:4.5.2
| | +-commons-codec:commons-codec:1.10
| | +-commons-codec:commons-codec:1.9 (evicted by: 1.10)
| | +-commons-logging:commons-logging:1.2
| | +-org.apache.httpcomponents:httpcore:4.4.4
| | | +-org.apache.httpcomponents:httpcore:4.4.1 (evicted by: 4.4.4)
| +-org.apache.httpcomponents:httpcore:4.4.4
{code} [~srowen] What do you think about this? Do we want to upgrade this? > Recommended upgrade libthrift to 0.9.3 > --- > > Key: SPARK-18997 > URL: https://issues.apache.org/jira/browse/SPARK-18997 > Project: Spark > Issue Type: Bug > Components: Build >Reporter: meiyoula >Priority: Critical > > libthrift 0.9.2 has a serious security vulnerability: CVE-2015-3254
[jira] [Resolved] (SPARK-17755) Master may ask a worker to launch an executor before the worker actually got the response of registration
[ https://issues.apache.org/jira/browse/SPARK-17755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-17755. -- Resolution: Fixed Fix Version/s: 2.2.0 > Master may ask a worker to launch an executor before the worker actually got > the response of registration > - > > Key: SPARK-17755 > URL: https://issues.apache.org/jira/browse/SPARK-17755 > Project: Spark > Issue Type: Bug > Components: Spark Core >Reporter: Yin Huai >Assignee: Shixiong Zhu > Fix For: 2.2.0 > > > I somehow saw a failed test {{org.apache.spark.DistributedSuite.caching in > memory, serialized, replicated}}. Its log shows that the Spark master asked the > worker to launch an executor before the worker actually got the response to its > registration. So the master knew that the worker had been registered, but > the worker did not know whether it itself had been registered. > {code}
> 16/09/30 14:53:53.681 dispatcher-event-loop-0 INFO Master: Registering worker localhost:38262 with 1 cores, 1024.0 MB RAM
> 16/09/30 14:53:53.681 dispatcher-event-loop-0 INFO Master: Launching executor app-20160930145353-/1 on worker worker-20160930145353-localhost-38262
> 16/09/30 14:53:53.682 dispatcher-event-loop-3 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20160930145353-/1 on worker-20160930145353-localhost-38262 (localhost:38262) with 1 cores
> 16/09/30 14:53:53.683 dispatcher-event-loop-3 INFO StandaloneSchedulerBackend: Granted executor ID app-20160930145353-/1 on hostPort localhost:38262 with 1 cores, 1024.0 MB RAM
> 16/09/30 14:53:53.683 dispatcher-event-loop-0 WARN Worker: Invalid Master (spark://localhost:46460) attempted to launch executor.
> 16/09/30 14:53:53.687 worker-register-master-threadpool-0 INFO Worker: Successfully registered with master spark://localhost:46460
> {code}
> Then, it seems the worker did not launch any executor. 
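The race in SPARK-17755 (the master launching an executor before the worker has processed its registration reply) can be modeled as below. This is a hedged sketch with assumed class and method names, not the actual 2.2.0 patch:

```scala
// Illustrative model of the race: the worker rejects LaunchExecutor from
// any master until its own registration handshake has completed, which is
// what produces the "Invalid Master ... attempted to launch executor" log.
case class LaunchExecutor(masterUrl: String, appId: String, execId: Int)

class WorkerSketch(knownMasterUrl: String) {
  @volatile private var registered = false

  // Called when the RegisteredWorker reply from the master arrives.
  def onRegisteredWithMaster(): Unit = { registered = true }

  def receive(msg: LaunchExecutor): Unit = {
    if (!registered || msg.masterUrl != knownMasterUrl) {
      // Registration reply not yet processed: the launch request is dropped,
      // so the master believes an executor exists that was never started.
      println(s"Ignoring launch request from ${msg.masterUrl}")
    } else {
      // ...actually start the executor process...
    }
  }
}
```

One fix direction consistent with the resolution is ordering: the master must not send LaunchExecutor to a worker before that worker has received and processed its RegisteredWorker reply.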
[jira] [Commented] (SPARK-10872) Derby error (XSDB6) when creating new HiveContext after restarting SparkContext
[ https://issues.apache.org/jira/browse/SPARK-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777808#comment-15777808 ] karthik G S commented on SPARK-10872: -- So, what is the possible fix for this issue? > Derby error (XSDB6) when creating new HiveContext after restarting > SparkContext > --- > > Key: SPARK-10872 > URL: https://issues.apache.org/jira/browse/SPARK-10872 > Project: Spark > Issue Type: Bug > Components: PySpark, SQL >Affects Versions: 1.4.0, 1.4.1, 1.5.0 >Reporter: Dmytro Bielievtsov > > Starting from Spark 1.4.0 (works well on 1.3.1), the following code fails > with "XSDB6: Another instance of Derby may have already booted the database > ~/metastore_db": > {code:python}
> from pyspark import SparkContext
> from pyspark.sql import HiveContext  # HiveContext lives in pyspark.sql
> sc = SparkContext("local[*]", "app1")
> sql = HiveContext(sc)
> sql.createDataFrame([[1]]).collect()
> sc.stop()
> sc = SparkContext("local[*]", "app2")
> sql = HiveContext(sc)
> sql.createDataFrame([[1]]).collect()  # Py4J error
> {code}
> This is related to [#SPARK-9539], and I intend to restart the Spark context > several times for isolated jobs to prevent cache cluttering and GC errors. > Here's a larger part of the full error trace: > {noformat} > Failed to start database 'metastore_db' with class loader > org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@13015ec0, see > the next exception for details. > org.datanucleus.exceptions.NucleusDataStoreException: Failed to start > database 'metastore_db' with class loader > org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@13015ec0, see > the next exception for details. 
> at > org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:516) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:298) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631) > at > org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) > at > org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187) > at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356) > at > org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775) > at > org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333) > at > org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) > at java.security.AccessController.doPrivileged(Native Method) > at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) > at > javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166) > at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) > at 
javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394) > at > org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291) > at > org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.(RawStoreProxy.java:57) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571) > at > org.apache.h
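Regarding the question above about a possible fix: a common workaround, assuming the embedded Derby metastore is the culprit, is to create the context once and share it rather than restarting it per job. A minimal Scala sketch (the object name is illustrative; the same pattern applies in PySpark):

```scala
// Hedged workaround sketch for XSDB6, not an official fix: the embedded
// Derby driver allows only one booted instance of ~/metastore_db per JVM,
// so build the SparkContext/HiveContext once and reuse them for all jobs.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SharedHive {
  // Lazily initialized exactly once for the lifetime of the JVM.
  lazy val sc: SparkContext =
    new SparkContext(new SparkConf().setMaster("local[*]").setAppName("shared"))
  lazy val sql: HiveContext = new HiveContext(sc)
}
```

To address the cache-cluttering motivation without a restart, cached RDDs can be dropped on the shared context instead, e.g. `SharedHive.sc.getPersistentRDDs.values.foreach(_.unpersist())`.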
[jira] [Commented] (SPARK-18996) Spark SQL support for post hooks
[ https://issues.apache.org/jira/browse/SPARK-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1599#comment-1599 ] Atul Payapilly commented on SPARK-18996: Yep, that's right; that's exactly what I'm looking for. > Spark SQL support for post hooks > > > Key: SPARK-18996 > URL: https://issues.apache.org/jira/browse/SPARK-18996 > Project: Spark > Issue Type: New Feature > Components: SQL >Reporter: Atul Payapilly > > Spark SQL used to support Hive execution hooks (incidentally, as a side effect > of using the Hive Driver in earlier versions) but no longer does so. More > details at: https://issues.apache.org/jira/browse/SPARK-18879. > The post-hook functionality, where it is possible to determine which > partitions were written to, is extremely useful. E.g. suppose the data is > written and then exported to an external system; without post hooks, there is > no way to determine which partitions to export. > This feature request is to provide this capability, ideally using the Hive > exec hooks API if possible so users don't need to rewrite their hooks.
[jira] [Commented] (SPARK-18996) Spark SQL support for post hooks
[ https://issues.apache.org/jira/browse/SPARK-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1587#comment-1587 ] Xiao Li commented on SPARK-18996: - Are you requesting a Hive feature like https://issues.apache.org/jira/browse/HIVE-854? > Spark SQL support for post hooks > > > Key: SPARK-18996 > URL: https://issues.apache.org/jira/browse/SPARK-18996 > Project: Spark > Issue Type: New Feature > Components: SQL >Reporter: Atul Payapilly > > Spark SQL used to support Hive execution hooks (incidentally, as a side effect > of using the Hive Driver in earlier versions) but no longer does so. More > details at: https://issues.apache.org/jira/browse/SPARK-18879. > The post-hook functionality, where it is possible to determine which > partitions were written to, is extremely useful. E.g. suppose the data is > written and then exported to an external system; without post hooks, there is > no way to determine which partitions to export. > This feature request is to provide this capability, ideally using the Hive > exec hooks API if possible so users don't need to rewrite their hooks.
[jira] [Assigned] (SPARK-18999) simplify Literal codegen
[ https://issues.apache.org/jira/browse/SPARK-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18999: Assignee: Wenchen Fan (was: Apache Spark) > simplify Literal codegen > > > Key: SPARK-18999 > URL: https://issues.apache.org/jira/browse/SPARK-18999 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18999) simplify Literal codegen
[ https://issues.apache.org/jira/browse/SPARK-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1579#comment-1579 ] Apache Spark commented on SPARK-18999: -- User 'cloud-fan' has created a pull request for this issue: https://github.com/apache/spark/pull/16402 > simplify Literal codegen > > > Key: SPARK-18999 > URL: https://issues.apache.org/jira/browse/SPARK-18999 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18999) simplify Literal codegen
[ https://issues.apache.org/jira/browse/SPARK-18999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18999: Assignee: Apache Spark (was: Wenchen Fan) > simplify Literal codegen > > > Key: SPARK-18999 > URL: https://issues.apache.org/jira/browse/SPARK-18999 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Wenchen Fan >Assignee: Apache Spark >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-18999) simplify Literal codegen
Wenchen Fan created SPARK-18999: --- Summary: simplify Literal codegen Key: SPARK-18999 URL: https://issues.apache.org/jira/browse/SPARK-18999 Project: Spark Issue Type: Improvement Components: SQL Reporter: Wenchen Fan Assignee: Wenchen Fan Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-18675) CTAS for hive serde table should work for all hive versions
[ https://issues.apache.org/jira/browse/SPARK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-18675: Fix Version/s: 2.0.3 > CTAS for hive serde table should work for all hive versions > --- > > Key: SPARK-18675 > URL: https://issues.apache.org/jira/browse/SPARK-18675 > Project: Spark > Issue Type: Bug > Components: SQL >Reporter: Wenchen Fan >Assignee: Wenchen Fan > Fix For: 2.0.3, 2.1.1, 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-18237) hive.exec.stagingdir has no effect in Spark 2.0.1
[ https://issues.apache.org/jira/browse/SPARK-18237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-18237: Fix Version/s: 2.0.3 > hive.exec.stagingdir has no effect in Spark 2.0.1 > - > > Key: SPARK-18237 > URL: https://issues.apache.org/jira/browse/SPARK-18237 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.1 >Reporter: ClassNotFoundExp >Assignee: ClassNotFoundExp > Fix For: 2.0.3, 2.1.0 > > > hive.exec.stagingdir has no effect in Spark 2.0.1; this is related to > https://issues.apache.org/jira/browse/SPARK-11021
[jira] [Updated] (SPARK-18703) Insertion/CTAS against Hive Tables: Staging Directories and Data Files Not Dropped Until Normal Termination of JVM
[ https://issues.apache.org/jira/browse/SPARK-18703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-18703: Fix Version/s: 2.0.3 > Insertion/CTAS against Hive Tables: Staging Directories and Data Files Not > Dropped Until Normal Termination of JVM > -- > > Key: SPARK-18703 > URL: https://issues.apache.org/jira/browse/SPARK-18703 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.2 >Reporter: Xiao Li >Assignee: Xiao Li >Priority: Critical > Fix For: 2.0.3, 2.1.1, 2.2.0 > > > Below are the files/directories generated for three inserts againsts a Hive > table: > {noformat} > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/._SUCCESS.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/_SUCCESS > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/part-0 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1 > 
/private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/._SUCCESS.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/_SUCCESS > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/part-0 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/._SUCCESS.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/_SUCCESS > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/part-0 > 
/private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/part-0 > {noformat} > The first 18 files are temporary. We do not drop them until the JVM > terminates. If the JVM does not terminate normally, these temporary > files/directories will never be dropped. > Only the last two files are needed, as shown below. > {noformat} > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/part-0 > {noformat} > Ideally,
[jira] [Assigned] (SPARK-18998) Add a cbo conf to switch between default statistics and cbo estimated statistics
[ https://issues.apache.org/jira/browse/SPARK-18998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18998: Assignee: Apache Spark > Add a cbo conf to switch between default statistics and cbo estimated > statistics > > > Key: SPARK-18998 > URL: https://issues.apache.org/jira/browse/SPARK-18998 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.2.0 >Reporter: Zhenhua Wang >Assignee: Apache Spark > > We need a cbo configuration to switch between default stats and estimated > stats. We also need a new statistics method in LogicalPlan with conf as its > parameter, in order to pass the cbo switch and other estimation related > configurations in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18998) Add a cbo conf to switch between default statistics and cbo estimated statistics
[ https://issues.apache.org/jira/browse/SPARK-18998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18998: Assignee: (was: Apache Spark) > Add a cbo conf to switch between default statistics and cbo estimated > statistics > > > Key: SPARK-18998 > URL: https://issues.apache.org/jira/browse/SPARK-18998 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.2.0 >Reporter: Zhenhua Wang > > We need a cbo configuration to switch between default stats and estimated > stats. We also need a new statistics method in LogicalPlan with conf as its > parameter, in order to pass the cbo switch and other estimation related > configurations in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18998) Add a cbo conf to switch between default statistics and cbo estimated statistics
[ https://issues.apache.org/jira/browse/SPARK-18998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777679#comment-15777679 ] Apache Spark commented on SPARK-18998: -- User 'wzhfy' has created a pull request for this issue: https://github.com/apache/spark/pull/16401 > Add a cbo conf to switch between default statistics and cbo estimated > statistics > > > Key: SPARK-18998 > URL: https://issues.apache.org/jira/browse/SPARK-18998 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.2.0 >Reporter: Zhenhua Wang > > We need a cbo configuration to switch between default stats and estimated > stats. We also need a new statistics method in LogicalPlan with conf as its > parameter, in order to pass the cbo switch and other estimation related > configurations in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-18998) Add a cbo conf to switch between default statistics and cbo estimated statistics
Zhenhua Wang created SPARK-18998: Summary: Add a cbo conf to switch between default statistics and cbo estimated statistics Key: SPARK-18998 URL: https://issues.apache.org/jira/browse/SPARK-18998 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 2.2.0 Reporter: Zhenhua Wang We need a cbo configuration to switch between default stats and estimated stats. We also need a new statistics method in LogicalPlan with conf as its parameter, in order to pass the cbo switch and other estimation related configurations in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
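The SPARK-18998 description asks for a statistics method on LogicalPlan that takes the conf as a parameter so it can branch on the cbo switch. A hedged sketch of that API shape follows; the class and member names here are illustrative assumptions, not the merged design:

```scala
// Sketch of the proposed shape: statistics(conf) selects between the
// size-only default statistics and cbo-estimated statistics.
case class CatalystConfSketch(cboEnabled: Boolean)
case class Statistics(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

abstract class LogicalPlanSketch {
  // Size-only statistics, as computed by default today.
  def defaultStats: Statistics
  // Statistics estimated from column-level stats (the cbo path).
  def cboEstimatedStats: Statistics

  def statistics(conf: CatalystConfSketch): Statistics =
    if (conf.cboEnabled) cboEstimatedStats else defaultStats
}
```

Passing the conf through, rather than reading a global, also leaves room for the "other estimation related configurations" the description anticipates.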
[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777650#comment-15777650 ] Debasish Das edited comment on SPARK-13857 at 12/26/16 5:57 AM: item->item and user->user was done in an old PR I had...if there is interest I can resend it...nice to see how it compares with approximate nearest neighbor work from uber: https://github.com/apache/spark/pull/6213 was (Author: debasish83): item->item and user->user was done in an old PR I had...if there is interested I can resend it...nice to see how it compares with approximate nearest neighbor work from uber: https://github.com/apache/spark/pull/6213 > Feature parity for ALS ML with MLLIB > > > Key: SPARK-13857 > URL: https://issues.apache.org/jira/browse/SPARK-13857 > Project: Spark > Issue Type: Sub-task > Components: ML >Reporter: Nick Pentreath >Assignee: Nick Pentreath > > Currently {{mllib.recommendation.MatrixFactorizationModel}} has methods > {{recommendProducts/recommendUsers}} for recommending top K to a given user / > item, as well as {{recommendProductsForUsers/recommendUsersForProducts}} to > recommend top K across all users/items. > Additionally, SPARK-10802 is for adding the ability to do > {{recommendProductsForUsers}} for a subset of users (or vice versa). > Look at exposing or porting (as appropriate) these methods to ALS in ML. > Investigate if efficiency can be improved at the same time (see SPARK-11968). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777650#comment-15777650 ] Debasish Das commented on SPARK-13857: -- item->item and user->user was done in an old PR I had... if there is interest I can resend it... nice to see how it compares with the approximate nearest neighbor work from Uber: https://github.com/apache/spark/pull/6213 > Feature parity for ALS ML with MLLIB > > > Key: SPARK-13857 > URL: https://issues.apache.org/jira/browse/SPARK-13857 > Project: Spark > Issue Type: Sub-task > Components: ML >Reporter: Nick Pentreath >Assignee: Nick Pentreath > > Currently {{mllib.recommendation.MatrixFactorizationModel}} has methods > {{recommendProducts/recommendUsers}} for recommending top K to a given user / > item, as well as {{recommendProductsForUsers/recommendUsersForProducts}} to > recommend top K across all users/items. > Additionally, SPARK-10802 is for adding the ability to do > {{recommendProductsForUsers}} for a subset of users (or vice versa). > Look at exposing or porting (as appropriate) these methods to ALS in ML. > Investigate if efficiency can be improved at the same time (see SPARK-11968).
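For reference, the existing spark.mllib methods named in the issue description can be exercised as below. This assumes a running spark-shell with an active SparkContext `sc`; the data is illustrative:

```scala
// The existing MatrixFactorizationModel API that the ML parity work refers to.
import org.apache.spark.mllib.recommendation.{ALS, Rating}

val ratings = sc.parallelize(Seq(
  Rating(1, 10, 5.0), Rating(1, 20, 1.0), Rating(2, 10, 4.0)))
val model = ALS.train(ratings, 5 /* rank */, 10 /* iterations */)

// Top-K across all users/items in one call.
val perUser = model.recommendProductsForUsers(2) // RDD[(Int, Array[Rating])]
val perItem = model.recommendUsersForProducts(2)

// Top-K for a single user / single item.
val forUser1 = model.recommendProducts(1, 2)
val forItem10 = model.recommendUsers(10, 2)
```

The item->item and user->user similarity mentioned in the comment is not part of this API; that is what the linked PR (apache/spark#6213) explored.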
[jira] [Commented] (SPARK-18941) "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777600#comment-15777600 ] Apache Spark commented on SPARK-18941: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/16400 > "Drop Table" command doesn't delete the directory of the managed Hive table > when users specifying locations > --- > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Documentation > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18941) "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18941: Assignee: Apache Spark > "Drop Table" command doesn't delete the directory of the managed Hive table > when users specifying locations > --- > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Documentation > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat >Assignee: Apache Spark > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18941) "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18941: Assignee: (was: Apache Spark) > "Drop Table" command doesn't delete the directory of the managed Hive table > when users specifying locations > --- > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Documentation > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-12613) Elimination of Outer Join by Parent Join Condition
[ https://issues.apache.org/jira/browse/SPARK-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-12613. - Resolution: Duplicate > Elimination of Outer Join by Parent Join Condition > -- > > Key: SPARK-12613 > URL: https://issues.apache.org/jira/browse/SPARK-12613 > Project: Spark > Issue Type: Improvement > Components: Optimizer, SQL >Affects Versions: 1.6.0 >Reporter: Xiao Li >Priority: Critical > > When an outer join is involved in another join (called the parent join), and the > join type of the parent join is inner, left-semi, left-outer or > right-outer, check whether the join condition of the parent join satisfies the > following two conditions: > 1) there exist null-filtering predicates against the columns in the > null-supplying side of the parent join. > 2) these columns are from the child join. > If such join predicates exist, apply the elimination rules: > - full outer -> inner if both sides of the child join have such predicates > - left outer -> inner if the right side of the child join has such predicates > - right outer -> inner if the left side of the child join has such predicates > - full outer -> left outer if only the left side of the child join has such > predicates > - full outer -> right outer if only the right side of the child join has > such predicates -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
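The rule table in the issue above can be sketched as a small decision function (illustrative Python, not Catalyst optimizer code; the boolean flags record whether the parent join's condition null-filters columns from that side of the child join):

```python
def eliminate_outer_join(child_join_type, left_null_filtered, right_null_filtered):
    """Return the join type the child outer join can be rewritten to."""
    t = child_join_type
    if t == "full_outer":
        if left_null_filtered and right_null_filtered:
            return "inner"          # both sides null-filtered
        if left_null_filtered:
            return "left_outer"     # only the left side null-filtered
        if right_null_filtered:
            return "right_outer"    # only the right side null-filtered
    elif t == "left_outer" and right_null_filtered:
        return "inner"              # right (null-supplying) side null-filtered
    elif t == "right_outer" and left_null_filtered:
        return "inner"              # left (null-supplying) side null-filtered
    return t                        # no predicate allows elimination
```

The intuition: a null-filtering predicate (e.g. an equality comparison) on the null-supplying side discards exactly the padded rows the outer join would have produced, so the cheaper join type yields the same result.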
[jira] [Commented] (SPARK-18941) "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777553#comment-15777553 ] Dongjoon Hyun commented on SPARK-18941: --- Yep. I'll make a PR soon. > "Drop Table" command doesn't delete the directory of the managed Hive table > when users specifying locations > --- > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Documentation > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18931) Create empty staging directory in partitioned table on insert
[ https://issues.apache.org/jira/browse/SPARK-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777543#comment-15777543 ] Xiao Li commented on SPARK-18931: - Yeah. The PR https://github.com/apache/spark/pull/16399 backports the fix to Spark 2.0. > Create empty staging directory in partitioned table on insert > - > > Key: SPARK-18931 > URL: https://issues.apache.org/jira/browse/SPARK-18931 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.2 >Reporter: Egor Pahomov > > CREATE TABLE temp.test_partitioning_4 ( > num string > ) > PARTITIONED BY ( > day string) > stored as parquet > On every > INSERT INTO TABLE temp.test_partitioning_4 PARTITION (day) > select day, count(*) as num from > hss.session where year=2016 and month=4 > group by day > a new directory > ".hive-staging_hive_2016-12-19_15-55-11_298_3412488541559534475-4" is created on > HDFS. It's a big issue, because I insert every day, and a growing pile of empty dirs on > HDFS is very bad for HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
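Until the backported fix lands, the leftover empty staging directories can be swept up out of band. A minimal Python sketch of such a cleanup (assumptions: it runs against a local-filesystem stand-in for the table location; a real HDFS deployment would go through `hadoop fs` or a Hadoop client library instead):

```python
import os
import tempfile

def remove_empty_staging_dirs(table_dir: str) -> list:
    """Delete empty '.hive-staging_hive_*' directories directly under
    table_dir. A workaround sketch, not Spark's fix: the real fix stops
    the empty directories from being created in the first place."""
    removed = []
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        if (name.startswith(".hive-staging_hive_")
                and os.path.isdir(path) and not os.listdir(path)):
            os.rmdir(path)          # rmdir only succeeds on empty dirs
            removed.append(name)
    return removed

# Demo against a scratch directory standing in for the table location.
demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, ".hive-staging_hive_2016-12-19_15-55-11_298_0-4"))
os.makedirs(os.path.join(demo, "day=2016-04-01"))
open(os.path.join(demo, "day=2016-04-01", "part-0"), "w").close()
removed = remove_empty_staging_dirs(demo)
```

Only empty directories matching the staging prefix are touched, so partition directories holding real data are never at risk.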
[jira] [Updated] (SPARK-18941) "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-18941: Summary: "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations (was: Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the directory associated with the Hive table (not EXTERNAL table) from the HDFS file system) > "Drop Table" command doesn't delete the directory of the managed Hive table > when users specifying locations > --- > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Bug > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-18941) "Drop Table" command doesn't delete the directory of the managed Hive table when users specifying locations
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-18941: Issue Type: Documentation (was: Bug) > "Drop Table" command doesn't delete the directory of the managed Hive table > when users specifying locations > --- > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Documentation > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18941) Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the directory associated with the Hive table (not EXTERNAL table) from the HDFS file system
[ https://issues.apache.org/jira/browse/SPARK-18941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777493#comment-15777493 ] Xiao Li commented on SPARK-18941: - We should document the behavior change. Maybe [~dongjoon] can submit a PR? > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system > - > > Key: SPARK-18941 > URL: https://issues.apache.org/jira/browse/SPARK-18941 > Project: Spark > Issue Type: Bug > Components: Java API >Affects Versions: 2.0.2 >Reporter: luat > > Spark thrift server, Spark 2.0.2, The "drop table" command doesn't delete the > directory associated with the Hive table (not EXTERNAL table) from the HDFS > file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-18997) Recommended upgrade libthrift to 0.9.3
meiyoula created SPARK-18997: Summary: Recommended upgrade libthrift to 0.9.3 Key: SPARK-18997 URL: https://issues.apache.org/jira/browse/SPARK-18997 Project: Spark Issue Type: Bug Components: Build Reporter: meiyoula Priority: Critical libthrift 0.9.2 has a serious security vulnerability: CVE-2015-3254 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18703) Insertion/CTAS against Hive Tables: Staging Directories and Data Files Not Dropped Until Normal Termination of JVM
[ https://issues.apache.org/jira/browse/SPARK-18703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777332#comment-15777332 ] Apache Spark commented on SPARK-18703: -- User 'gatorsmile' has created a pull request for this issue: https://github.com/apache/spark/pull/16399 > Insertion/CTAS against Hive Tables: Staging Directories and Data Files Not > Dropped Until Normal Termination of JVM > -- > > Key: SPARK-18703 > URL: https://issues.apache.org/jira/browse/SPARK-18703 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.2 >Reporter: Xiao Li >Assignee: Xiao Li >Priority: Critical > Fix For: 2.1.1, 2.2.0 > > > Below are the files/directories generated for three inserts againsts a Hive > table: > {noformat} > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/._SUCCESS.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/_SUCCESS > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-29_149_4298858301766472202-1/-ext-1/part-0 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1 > 
/private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/._SUCCESS.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/_SUCCESS > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_454_6445008511655931341-1/-ext-1/part-0 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1 > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/._SUCCESS.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/_SUCCESS > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.hive-staging_hive_2016-12-03_20-56-30_722_3388423608658711001-1/-ext-1/part-0 > 
/private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.part-0.crc > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/part-0 > {noformat} > The first 18 files are temporary. We do not drop them until JVM > termination. If the JVM does not terminate normally, these temporary > files/directories will never be dropped. > Only the last two files are needed, as shown below. > {noformat} > /private/var/folders/4b/sgmfldk15js406vk7lw5llzwgn/T/spark-41eaa5ce-0288-471e-bba1-09cc482813ff/.part-0.crc > /pr
[jira] [Commented] (SPARK-18237) hive.exec.stagingdir have no effect in spark2.0.1
[ https://issues.apache.org/jira/browse/SPARK-18237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777331#comment-15777331 ] Apache Spark commented on SPARK-18237: -- User 'gatorsmile' has created a pull request for this issue: https://github.com/apache/spark/pull/16399 > hive.exec.stagingdir have no effect in spark2.0.1 > - > > Key: SPARK-18237 > URL: https://issues.apache.org/jira/browse/SPARK-18237 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.1 >Reporter: ClassNotFoundExp >Assignee: ClassNotFoundExp > Fix For: 2.1.0 > > > hive.exec.stagingdir have no effect in spark2.0.1, > this relevant to https://issues.apache.org/jira/browse/SPARK-11021 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18675) CTAS for hive serde table should work for all hive versions
[ https://issues.apache.org/jira/browse/SPARK-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777333#comment-15777333 ] Apache Spark commented on SPARK-18675: -- User 'gatorsmile' has created a pull request for this issue: https://github.com/apache/spark/pull/16399 > CTAS for hive serde table should work for all hive versions > --- > > Key: SPARK-18675 > URL: https://issues.apache.org/jira/browse/SPARK-18675 > Project: Spark > Issue Type: Bug > Components: SQL >Reporter: Wenchen Fan >Assignee: Wenchen Fan > Fix For: 2.1.1, 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18842) De-duplicate paths in classpaths in processes for local-cluster mode to work around the length limitation on Windows
[ https://issues.apache.org/jira/browse/SPARK-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18842: Assignee: Apache Spark (was: Hyukjin Kwon) > De-duplicate paths in classpaths in processes for local-cluster mode to work > around the length limitation on Windows > > > Key: SPARK-18842 > URL: https://issues.apache.org/jira/browse/SPARK-18842 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Reporter: Hyukjin Kwon >Assignee: Apache Spark > Fix For: 2.2.0 > > > Currently, some tests are being failed and hanging on Windows due to this > problem. For the reason in SPARK-18718, some tests using {{local-cluster}} > mode were disabled on Windows due to the length limitation by paths given to > classpaths. > The limitation seems roughly 32K (see > https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/ and > https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows) > but executors were being launched with the command such as > https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea in > (only) tests. > This length is roughly 40K due to the class paths. However, it seems there > are duplicates more than half. So, if we de-duplicate this paths, it is > reduced to roughly 20K. > Maybe, we should consider as some more paths are added in the future but it > seems better than disabling all the tests for now with minimised changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18842) De-duplicate paths in classpaths in processes for local-cluster mode to work around the length limitation on Windows
[ https://issues.apache.org/jira/browse/SPARK-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18842: Assignee: Hyukjin Kwon (was: Apache Spark) > De-duplicate paths in classpaths in processes for local-cluster mode to work > around the length limitation on Windows > > > Key: SPARK-18842 > URL: https://issues.apache.org/jira/browse/SPARK-18842 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon > Fix For: 2.2.0 > > > Currently, some tests are being failed and hanging on Windows due to this > problem. For the reason in SPARK-18718, some tests using {{local-cluster}} > mode were disabled on Windows due to the length limitation by paths given to > classpaths. > The limitation seems roughly 32K (see > https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/ and > https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows) > but executors were being launched with the command such as > https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea in > (only) tests. > This length is roughly 40K due to the class paths. However, it seems there > are duplicates more than half. So, if we de-duplicate this paths, it is > reduced to roughly 20K. > Maybe, we should consider as some more paths are added in the future but it > seems better than disabling all the tests for now with minimised changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18842) De-duplicate paths in classpaths in processes for local-cluster mode to work around the length limitation on Windows
[ https://issues.apache.org/jira/browse/SPARK-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15776477#comment-15776477 ] Apache Spark commented on SPARK-18842: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/16398 > De-duplicate paths in classpaths in processes for local-cluster mode to work > around the length limitation on Windows > > > Key: SPARK-18842 > URL: https://issues.apache.org/jira/browse/SPARK-18842 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon > Fix For: 2.2.0 > > > Currently, some tests are being failed and hanging on Windows due to this > problem. For the reason in SPARK-18718, some tests using {{local-cluster}} > mode were disabled on Windows due to the length limitation by paths given to > classpaths. > The limitation seems roughly 32K (see > https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/ and > https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows) > but executors were being launched with the command such as > https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea in > (only) tests. > This length is roughly 40K due to the class paths. However, it seems there > are duplicates more than half. So, if we de-duplicate this paths, it is > reduced to roughly 20K. > Maybe, we should consider as some more paths are added in the future but it > seems better than disabling all the tests for now with minimised changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
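The de-duplication described in the issue above can be sketched as an order-preserving filter over the classpath string (illustrative Python, not the launcher's actual Java/Scala change; the normalization rule — trim, drop trailing separators, case-fold for Windows — is an assumption):

```python
def dedup_classpath(classpath: str, sep: str = ";") -> str:
    """Collapse duplicate classpath entries, keeping first-seen order.

    Windows uses ';' as the path separator and case-insensitive paths,
    so entries are compared case-folded. Mixed URL vs. local-path forms
    of the same jar are NOT unified here; that would need real URI
    normalization.
    """
    seen = set()
    out = []
    for entry in classpath.split(sep):
        key = entry.strip().rstrip("/\\").lower()
        if key and key not in seen:
            seen.add(key)
            out.append(entry)
    return sep.join(out)
```

On a command line near the ~32K Windows limit, halving the classpath this way is exactly the kind of reduction (roughly 40K down to roughly 20K) the issue reports.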
[jira] [Comment Edited] (SPARK-18842) De-duplicate paths in classpaths in processes for local-cluster mode to work around the length limitation on Windows
[ https://issues.apache.org/jira/browse/SPARK-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15776461#comment-15776461 ] Hyukjin Kwon edited comment on SPARK-18842 at 12/25/16 1:19 PM: {{ReplSuite}} hangs on Windows due to this problem. The reason is, it uses the paths as URLs in the tests whereas some added afterward are normal local paths. So, many paths are duplicated because normal local paths and URLs are mixed. This length is up to 40K which hits the length limitation problem on Windows. Please refer the tests here - https://ci.appveyor.com/project/spark-test/spark/build/395-find-path-issues and the command line here - https://gist.github.com/HyukjinKwon/46af7946c9a5fd4c6fc70a8a0aba1beb was (Author: hyukjin.kwon): {{ReplSuite}} hangs on Windows due to this problem. The reason is, it converts the paths into URL in the tests. So, many paths are duplicated because normal local paths and URLs are mixed. Please refer the tests here - https://ci.appveyor.com/project/spark-test/spark/build/395-find-path-issues and the command line here - https://gist.github.com/HyukjinKwon/46af7946c9a5fd4c6fc70a8a0aba1beb > De-duplicate paths in classpaths in processes for local-cluster mode to work > around the length limitation on Windows > > > Key: SPARK-18842 > URL: https://issues.apache.org/jira/browse/SPARK-18842 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon > Fix For: 2.2.0 > > > Currently, some tests are being failed and hanging on Windows due to this > problem. For the reason in SPARK-18718, some tests using {{local-cluster}} > mode were disabled on Windows due to the length limitation by paths given to > classpaths. 
> The limitation seems roughly 32K (see > https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/ and > https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows) > but executors were being launched with the command such as > https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea in > (only) tests. > This length is roughly 40K due to the class paths. However, it seems there > are duplicates more than half. So, if we de-duplicate this paths, it is > reduced to roughly 20K. > Maybe, we should consider as some more paths are added in the future but it > seems better than disabling all the tests for now with minimised changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-18842) De-duplicate paths in classpaths in processes for local-cluster mode to work around the length limitation on Windows
[ https://issues.apache.org/jira/browse/SPARK-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-18842: -- {{ReplSuite}} hangs on Windows due to this problem. The reason is, it converts the paths into URL in the tests. So, many paths are duplicated because normal local paths and URLs are mixed. Please refer the tests here - https://ci.appveyor.com/project/spark-test/spark/build/395-find-path-issues and the command line here - https://gist.github.com/HyukjinKwon/46af7946c9a5fd4c6fc70a8a0aba1beb > De-duplicate paths in classpaths in processes for local-cluster mode to work > around the length limitation on Windows > > > Key: SPARK-18842 > URL: https://issues.apache.org/jira/browse/SPARK-18842 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon > Fix For: 2.2.0 > > > Currently, some tests are being failed and hanging on Windows due to this > problem. For the reason in SPARK-18718, some tests using {{local-cluster}} > mode were disabled on Windows due to the length limitation by paths given to > classpaths. > The limitation seems roughly 32K (see > https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/ and > https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows) > but executors were being launched with the command such as > https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea in > (only) tests. > This length is roughly 40K due to the class paths. However, it seems there > are duplicates more than half. So, if we de-duplicate this paths, it is > reduced to roughly 20K. > Maybe, we should consider as some more paths are added in the future but it > seems better than disabling all the tests for now with minimised changes. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-18922) Fix more resource-closing-related and path-related test failures in identified ones on Windows
[ https://issues.apache.org/jira/browse/SPARK-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18922: Assignee: Hyukjin Kwon (was: Apache Spark) > Fix more resource-closing-related and path-related test failures in > identified ones on Windows > -- > > Key: SPARK-18922 > URL: https://issues.apache.org/jira/browse/SPARK-18922 > Project: Spark > Issue Type: Sub-task > Components: Tests >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Minor > Fix For: 2.2.0 > > > There are more instances that are failed on Windows as below: > - {{LauncherBackendSuite}}: > {code} > - local: launcher handle *** FAILED *** (30 seconds, 120 milliseconds) > The code passed to eventually never returned normally. Attempted 283 times > over 30.0960053 seconds. Last failure message: The reference was null. > (LauncherBackendSuite.scala:56) > org.scalatest.exceptions.TestFailedDueToTimeoutException: > at > org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:420) > at > org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:438) > - standalone/client: launcher handle *** FAILED *** (30 seconds, 47 > milliseconds) > The code passed to eventually never returned normally. Attempted 282 times > over 30.03798710002 seconds. Last failure message: The reference was > null. 
(LauncherBackendSuite.scala:56) > org.scalatest.exceptions.TestFailedDueToTimeoutException: > at > org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:420) > at > org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:438) > {code} > - {{SQLQuerySuite}}: > {code} > - specifying database name for a temporary table is not allowed *** FAILED > *** (125 milliseconds) > org.apache.spark.sql.AnalysisException: Path does not exist: > file:/C:\projects\spark\target\tmp\spark-1f4471ab-aac0-4239-ae35-833d54b37e52; > at > org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:382) > at > org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:370) > {code} > - {{JsonSuite}}: > {code} > - Loading a JSON dataset from a text file with SQL *** FAILED *** (94 > milliseconds) > org.apache.spark.sql.AnalysisException: Path does not exist: > file:/C:\projects\spark\target\tmp\spark-c918a8b7-fc09-433c-b9d0-36c0f78ae918; > at > org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:382) > at > org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:370) > {code} > - {{StateStoreSuite}}: > {code} > - SPARK-18342: commit fails when rename fails *** FAILED *** (16 milliseconds) > java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative > path in absolute URI: > StateStoreSuite29777261fs://C:%5Cprojects%5Cspark%5Ctarget%5Ctmp%5Cspark-ef349862-7281-4963-aaf3-add0d670a4ad%5C?-2218c2f8-2cf6-4f80-9cdf-96354e8246a77685899733421033312/0 > at org.apache.hadoop.fs.Path.initialize(Path.java:206) > at org.apache.hadoop.fs.Path.<init>(Path.java:116) > at org.apache.hadoop.fs.Path.<init>(Path.java:89) > ... 
> Cause: java.net.URISyntaxException: Relative path in absolute URI: > StateStoreSuite29777261fs://C:%5Cprojects%5Cspark%5Ctarget%5Ctmp%5Cspark-ef349862-7281-4963-aaf3-add0d670a4ad%5C?-2218c2f8-2cf6-4f80-9cdf-96354e8246a77685899733421033312/0 > at java.net.URI.checkPath(URI.java:1823) > at java.net.URI.<init>(URI.java:745) > at org.apache.hadoop.fs.Path.initialize(Path.java:203) > {code} > - {{HDFSMetadataLogSuite}}: > {code} > - FileManager: FileContextManager *** FAILED *** (94 milliseconds) > java.io.IOException: Failed to delete: > C:\projects\spark\target\tmp\spark-415bb0bd-396b-444d-be82-04599e025f21 > at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1010) > at > org.apache.spark.sql.test.SQLTestUtils$class.withTempDir(SQLTestUtils.scala:127) > at > org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.withTempDir(HDFSMetadataLogSuite.scala:38) > - FileManager: FileSystemManager *** FAILED *** (78 milliseconds) > java.io.IOException: Failed to delete: > C:\projects\spark\target\tmp\spark-ef8222cd-85aa-47c0-a396-bc7979e15088 > at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1010) > at > org.apache.spark.sql.test.SQLTestUtils$class.withTempDir(SQLTestUtils.scala:127) > at > org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.withTempDir(HDFSMetadataLogSuite.scala:38) > {code} > Please refer, for full logs, > https://ci.appveyor.com/project/spark-test/spark/build/283-tmp-test-base -- This message was sent by Atlassian JIRA (v6.3.4#6332) -
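The "Relative path in absolute URI" failures quoted above come from gluing a URI scheme onto a raw backslash Windows path, which leaves the path component percent-escaped and relative from `java.net.URI`'s point of view. A minimal Python illustration of the correct conversion (pathlib stands in for Hadoop's Path handling here; this is not Spark's code):

```python
from pathlib import PureWindowsPath

def windows_path_to_uri(path: str) -> str:
    """Convert an absolute Windows path to a well-formed file: URI.

    Naive concatenation of a scheme and the raw path produces strings
    like 'fs://C:%5Cprojects%5C...' that java.net.URI rejects; as_uri()
    emits the proper file:///C:/... form with forward slashes.
    """
    return PureWindowsPath(path).as_uri()

print(windows_path_to_uri(r"C:\projects\spark\target\tmp"))
# -> file:///C:/projects/spark/target/tmp
```

The same principle applies on the JVM side: build the URI from a normalized path (e.g. via `java.io.File.toURI()`) rather than by string concatenation.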
[jira] [Commented] (SPARK-18922) Fix more resource-closing-related and path-related test failures in identified ones on Windows
[ https://issues.apache.org/jira/browse/SPARK-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15776431#comment-15776431 ]

Apache Spark commented on SPARK-18922:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/16397

> Fix more resource-closing-related and path-related test failures in identified ones on Windows
>             Key: SPARK-18922
>             URL: https://issues.apache.org/jira/browse/SPARK-18922
>         Project: Spark
>      Issue Type: Sub-task
>      Components: Tests
>        Reporter: Hyukjin Kwon
>        Assignee: Hyukjin Kwon
>        Priority: Minor
>         Fix For: 2.2.0
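The {{StateStoreSuite}} "Relative path in absolute URI" error quoted in this thread is `java.net.URI.checkPath` rejecting a URI that has a scheme but whose path component does not begin with "/": exactly what a bare Windows drive-letter path like `C:\...` produces when it reaches a URI constructor. A small sketch of the mechanism (standalone, not the actual Hadoop `Path` code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathDemo {
    // URI's multi-argument constructors call checkPath, which throws
    // URISyntaxException("Relative path in absolute URI") when a scheme is
    // present but the path does not start with "/".
    static boolean accepted(String scheme, String path) {
        try {
            new URI(scheme, null, path, null);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepted("file", "C:\\projects\\spark"));  // rejected: no leading "/"
        System.out.println(accepted("file", "/C:/projects/spark"));   // accepted
    }
}
```

This is why prefixing the drive-letter path with a slash (or routing it through `File.toURI`, which does that for you) is the usual shape of the fix on Windows.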
[jira] [Assigned] (SPARK-18922) Fix more resource-closing-related and path-related test failures in identified ones on Windows
[ https://issues.apache.org/jira/browse/SPARK-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-18922:
------------------------------------

    Assignee: Apache Spark  (was: Hyukjin Kwon)

> Fix more resource-closing-related and path-related test failures in identified ones on Windows
>             Key: SPARK-18922
>             URL: https://issues.apache.org/jira/browse/SPARK-18922
>         Project: Spark
>      Issue Type: Sub-task
>      Components: Tests
>        Reporter: Hyukjin Kwon
>        Assignee: Apache Spark
>        Priority: Minor
>         Fix For: 2.2.0
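The {{HDFSMetadataLogSuite}} "Failed to delete" failures are the resource-closing half of this issue: unlike POSIX systems, Windows refuses to delete a file that still has an open handle, so any stream a test leaks makes `Utils.deleteRecursively` fail. A hedged sketch of the pattern that avoids it (the helper is mine, not Spark's actual code): close the handle with try-with-resources before deleting.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseBeforeDelete {
    // Illustrative helper: reads a file, then deletes it. The
    // try-with-resources block guarantees the reader's handle is released
    // before Files.delete runs, so the delete also succeeds on Windows.
    static String readThenDelete(Path file) throws IOException {
        String firstLine;
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            firstLine = reader.readLine();
        } // handle released here
        Files.delete(file);
        return firstLine;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("spark-demo", ".txt");
        Files.write(tmp, "hello".getBytes());
        System.out.println(readThenDelete(tmp));
        System.out.println(Files.exists(tmp));
    }
}
```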
[jira] [Reopened] (SPARK-18922) Fix more resource-closing-related and path-related test failures in identified ones on Windows
[ https://issues.apache.org/jira/browse/SPARK-18922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reopened SPARK-18922:
----------------------------------

I am reopening this as I found some more errors as below:

{code}
ColumnExpressionSuite:
- input_file_name, input_file_block_start, input_file_block_length - FileScanRDD *** FAILED *** (187 milliseconds)
  "file:///C:/projects/spark/target/tmp/spark-0b21b963-6cfa-411c-8d6f-e6a5e1e73bce/part-1-c083a03a-e55e-4b05-9073-451de352d006.snappy.parquet" did not contain "C:\projects\spark\target\tmp\spark-0b21b963-6cfa-411c-8d6f-e6a5e1e73bce" (ColumnExpressionSuite.scala:545)
- input_file_name, input_file_block_start, input_file_block_length - HadoopRDD *** FAILED *** (172 milliseconds)
  "file:/C:/projects/spark/target/tmp/spark-5d0afa94-7c2f-463b-9db9-2e8403e2bc5f/part-0-f6530138-9ad3-466d-ab46-0eeb6f85ed0b.txt" did not contain "C:\projects\spark\target\tmp\spark-5d0afa94-7c2f-463b-9db9-2e8403e2bc5f" (ColumnExpressionSuite.scala:569)
- input_file_name, input_file_block_start, input_file_block_length - NewHadoopRDD *** FAILED *** (156 milliseconds)
  "file:/C:/projects/spark/target/tmp/spark-a894c7df-c74d-4d19-82a2-a04744cb3766/part-0-29674e3f-3fcf-4327-9b04-4dab1d46338d.txt" did not contain "C:\projects\spark\target\tmp\spark-a894c7df-c74d-4d19-82a2-a04744cb3766" (ColumnExpressionSuite.scala:598)

DataStreamReaderWriterSuite:
- source metadataPath *** FAILED *** (62 milliseconds)
  org.mockito.exceptions.verification.junit.ArgumentsAreDifferent: Argument(s) are different!
  Wanted:
  streamSourceProvider.createSource(
      org.apache.spark.sql.SQLContext@3b04133b,
      "C:\projects\spark\target\tmp\streaming.metadata-b05db6ae-c8dc-4ce4-b0d9-1eb8c84876c0/sources/0",
      None,
      "org.apache.spark.sql.streaming.test",
      Map()
  );
  -> at org.apache.spark.sql.streaming.test.DataStreamReaderWriterSuite$$anonfun$12.apply$mcV$sp(DataStreamReaderWriterSuite.scala:374)
  Actual invocation has different arguments:
  streamSourceProvider.createSource(
      org.apache.spark.sql.SQLContext@3b04133b,
      "/C:/projects/spark/target/tmp/streaming.metadata-b05db6ae-c8dc-4ce4-b0d9-1eb8c84876c0/sources/0",
      None,
      "org.apache.spark.sql.streaming.test",
      Map()
  );

GlobalTempViewSuite:
- CREATE GLOBAL TEMP VIEW USING *** FAILED *** (110 milliseconds)
  org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:projectsspark arget mpspark-960398ba-a0a1-45f6-a59a-d98533f9f519;

CreateTableAsSelectSuite:
- CREATE TABLE USING AS SELECT *** FAILED *** (0 milliseconds)
  java.lang.IllegalArgumentException: Can not create a Path from an empty string
- create a table, drop it and create another one with the same name *** FAILED *** (16 milliseconds)
  java.lang.IllegalArgumentException: Can not create a Path from an empty string
- create table using as select - with partitioned by *** FAILED *** (0 milliseconds)
  java.lang.IllegalArgumentException: Can not create a Path from an empty string
- create table using as select - with non-zero buckets *** FAILED *** (0 milliseconds)
  java.lang.IllegalArgumentException: Can not create a Path from an empty string

HiveMetadataCacheSuite:
- partitioned table is cached when partition pruning is true *** FAILED *** (532 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
- partitioned table is cached when partition pruning is false *** FAILED *** (297 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

MultiDatabaseSuite:
- createExternalTable() to non-default database - with USE *** FAILED *** (954 milliseconds)
  org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:projectsspark arget mpspark-0839d9a7-5e29-467a-9e3e-3e4cd618ee09;
- createExternalTable() to non-default database - without USE *** FAILED *** (500 milliseconds)
  org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:projectsspark arget mpspark-c7e24d73-1d8f-45e8-ab7d-53a83087aec3;
- invalid database name and table names *** FAILED *** (31 milliseconds)
  "Path does not exist: file:/C:projectsspark arget mpspark-15a2a494-3483-4876-80e5-ec396e704b77;" did not contain "`t:a` is not a valid name for tables/databases. Valid names only contain alphabet characters, numbers and _." (MultiDatabaseSuite.scala:296)

OrcQuerySuite:
- SPARK-8501: Avoids discovery schema from empty ORC files *** FAILED *** (15 milliseconds)
  org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
- Verify th
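The {{ColumnExpressionSuite}} "did not contain" failures in this reopen compare a `file:` URI (forward slashes) against a raw Windows directory string (backslashes), so the substring check can never match. A hedged sketch of one common way to make such assertions separator-agnostic; this is illustrative only and not necessarily what the eventual fix did:

```java
import java.io.File;

public class PathCompareDemo {
    // Illustrative check: normalize both sides to forward slashes before the
    // containment test, so "file:///C:/..." can match "C:\projects\...".
    static boolean containsDir(String fileName, String dir) {
        String normalizedDir = dir.replace(File.separatorChar, '/').replace('\\', '/');
        return fileName.replace('\\', '/').contains(normalizedDir);
    }

    public static void main(String[] args) {
        String name = "file:///C:/projects/spark/target/tmp/spark-0b21/part-1.snappy.parquet";
        System.out.println(containsDir(name, "C:\\projects\\spark\\target\\tmp\\spark-0b21"));
    }
}
```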