[jira] [Created] (SPARK-11182) HDFS Delegation Token expires when calling "UserGroupInformation.getCurrentUser.addCredentials" in HA mode
Liangliang Gu created SPARK-11182:
-------------------------------------

             Summary: HDFS Delegation Token expires when calling "UserGroupInformation.getCurrentUser.addCredentials" in HA mode
                 Key: SPARK-11182
                 URL: https://issues.apache.org/jira/browse/SPARK-11182
             Project: Spark
          Issue Type: Bug
          Components: YARN
            Reporter: Liangliang Gu

In HA mode, DFSClient automatically generates an HDFS delegation token for each NameNode. These tokens are not refreshed when Spark updates the credentials for the current user, so Spark should update them as well in order to avoid token-expired errors.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
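The report above can be illustrated with a minimal model. This is NOT Hadoop's real API; all class and function names here are hypothetical. The point it shows: in HA mode there is one delegation token per NameNode, so a renewal must overwrite every per-NameNode entry, not just one of them.

```python
# Minimal model of the renewal problem (hypothetical names, not Hadoop's API).
class Credentials:
    """Maps a token alias (the service name) to a token value."""
    def __init__(self):
        self.tokens = {}

    def add_token(self, alias, token):
        self.tokens[alias] = token

    def add_all(self, other):
        # Overwrite each matching alias with the fresh token -- the behavior
        # the issue asks for when Spark refreshes the current user's credentials.
        self.tokens.update(other.tokens)


def renew_ha_tokens(current, fresh_token, namenode_aliases):
    """Propagate one freshly obtained token to every HA NameNode alias."""
    fresh = Credentials()
    for alias in namenode_aliases:
        fresh.add_token(alias, fresh_token)
    current.add_all(fresh)
    return current
```

If only the logical service's token were replaced, the per-NameNode aliases would keep their stale tokens and eventually expire, which is the failure mode the issue describes.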
[jira] [Commented] (SPARK-11182) HDFS Delegation Token expires when calling "UserGroupInformation.getCurrentUser.addCredentials" in HA mode
[ https://issues.apache.org/jira/browse/SPARK-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963091#comment-14963091 ]

Liangliang Gu commented on SPARK-11182:
---------------------------------------

https://github.com/apache/spark/pull/9168
[jira] [Closed] (SPARK-8030) Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'
[ https://issues.apache.org/jira/browse/SPARK-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liangliang Gu closed SPARK-8030.
--------------------------------
    Resolution: Invalid

Actually, 'Hcfs' is intended: if you read the comments for that method, you'll see that it refers to "Hadoop compatible file system".
[jira] [Commented] (SPARK-8030) Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'
[ https://issues.apache.org/jira/browse/SPARK-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568578#comment-14568578 ]

Liangliang Gu commented on SPARK-8030:
--------------------------------------

https://github.com/apache/spark/pull/6575
[jira] [Created] (SPARK-8030) Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'
Liangliang Gu created SPARK-8030:
--------------------------------

             Summary: Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'
                 Key: SPARK-8030
                 URL: https://issues.apache.org/jira/browse/SPARK-8030
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
            Reporter: Liangliang Gu

Spelling mistake in org.apache.spark.util.Utils: 'fetchHcfsFile' should be 'fetchHdfsFile'.
[jira] [Created] (SPARK-6491) Spark will put the current working dir to the CLASSPATH
Liangliang Gu created SPARK-6491:
--------------------------------

             Summary: Spark will put the current working dir to the CLASSPATH
                 Key: SPARK-6491
                 URL: https://issues.apache.org/jira/browse/SPARK-6491
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Liangliang Gu

When running bin/compute-classpath.sh, the output will be:

:/spark/conf:/spark/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.5.0-cdh5.2.0.jar:/spark/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar:/spark/lib_managed/jars/datanucleus-core-3.2.10.jar

Because of the leading ':', Java treats the empty first entry as the current working directory and adds it to the CLASSPATH, which Spark users do not expect. For example, if I call spark-shell in the folder /root and a core-site.xml exists under /root/, Spark will use this file as the Hadoop configuration file, even if I have already set HADOOP_CONF_DIR=/etc/hadoop/conf.
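The report above boils down to how the JVM splits CLASSPATH: an empty entry (here produced by the leading ':') means "current working directory". A small helper makes that interpretation explicit:

```python
# Demonstrates the CLASSPATH pitfall from the issue: the JVM interprets an
# empty entry in a ':'-separated classpath as the current working directory.

def classpath_entries(classpath):
    """Split a ':'-separated CLASSPATH, mapping empty entries to '.',
    mirroring how the JVM treats them."""
    return ['.' if entry == '' else entry for entry in classpath.split(':')]
```

For example, `classpath_entries(":/spark/conf:/spark/lib/a.jar")` yields `['.', '/spark/conf', '/spark/lib/a.jar']` - the leading colon silently puts the working directory first on the classpath.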
[jira] [Closed] (SPARK-6427) spark-sql does not throw an error when running in yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liangliang Gu closed SPARK-6427.
--------------------------------
    Resolution: Not a Problem
[jira] [Created] (SPARK-6427) spark-sql does not throw an error when running in yarn-cluster mode
Liangliang Gu created SPARK-6427:
--------------------------------

             Summary: spark-sql does not throw an error when running in yarn-cluster mode
                 Key: SPARK-6427
                 URL: https://issues.apache.org/jira/browse/SPARK-6427
             Project: Spark
          Issue Type: Bug
            Reporter: Liangliang Gu

Running spark-sql in yarn-cluster mode does not throw an error, while running spark-shell in yarn-cluster mode produces the following error:

Error: Cluster deploy mode is not applicable to Spark shells.
Run with --help for usage help or --verbose for debug output
[jira] [Created] (SPARK-6420) Driver's Block Manager does not use spark.driver.host in Yarn-Client mode
Liangliang Gu created SPARK-6420:
--------------------------------

             Summary: Driver's Block Manager does not use spark.driver.host in Yarn-Client mode
                 Key: SPARK-6420
                 URL: https://issues.apache.org/jira/browse/SPARK-6420
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Liangliang Gu

In my cluster, the YARN nodes do not know the client's host name, so I set spark.driver.host to the IP address of the client. However, in yarn-client mode the driver's Block Manager uses the hostname rather than spark.driver.host, and I get the following error:

TaskSetManager: Lost task 1.1 in stage 0.0 (TID 2, hadoop-node1538098): java.io.IOException: Failed to connect to example-hostname
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
	at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.UnresolvedAddressException
	at sun.nio.ch.Net.checkAddress(Net.java:127)
	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644)
	at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:193)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:200)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1029)
	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:496)
	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:481)
	at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:496)
	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:481)
	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:463)
	at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:849)
	at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:199)
	at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:165)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	... 1 more
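The behavior the report asks for can be sketched as follows. This is illustrative only, not Spark's real configuration API: an explicitly configured spark.driver.host should take precedence over the machine's hostname when advertising the driver's block manager address.

```python
# Hedged sketch of host-selection precedence (names are illustrative).
import socket

def advertised_driver_host(conf):
    """Return the address executors should connect back to: the configured
    spark.driver.host if set, otherwise the local hostname."""
    host = conf.get("spark.driver.host")
    if host:
        return host                 # e.g. a routable IP set by the user
    return socket.gethostname()     # fallback; may be unresolvable from YARN nodes
```

With this precedence, executors on YARN nodes that cannot resolve the client's hostname would still connect back via the configured IP address, avoiding the UnresolvedAddressException shown in the trace above.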
[jira] [Created] (SPARK-5771) Number of Cores in Completed Applications of Standalone Master Web Page is always 0 if sc.stop() is called
Liangliang Gu created SPARK-5771:
--------------------------------

             Summary: Number of Cores in Completed Applications of Standalone Master Web Page is always 0 if sc.stop() is called
                 Key: SPARK-5771
                 URL: https://issues.apache.org/jira/browse/SPARK-5771
             Project: Spark
          Issue Type: Bug
          Components: Web UI
            Reporter: Liangliang Gu

In standalone mode, the number of cores shown in the Completed Applications section of the Master web page is always zero if sc.stop() is called, but is correct if sc.stop() is not called.

The likely reason: after sc.stop() is called, the removeExecutor function of class ApplicationInfo is invoked, which reduces the variable coresGranted to zero. coresGranted is the value displayed as the number of cores on the web page.
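The accounting bug described above can be reproduced in a minimal model (class and method names are simplified stand-ins, not Spark's actual ApplicationInfo implementation): the UI displays a counter that executor removal drives back to zero, so a cleanly stopped application always shows 0 cores.

```python
# Minimal model of the reported bug: coresGranted is the only counter,
# and removeExecutor() (triggered by sc.stop()) decrements it to zero.

class ApplicationInfo:
    def __init__(self):
        self.cores_granted = 0

    def add_executor(self, cores):
        self.cores_granted += cores

    def remove_executor(self, cores):
        self.cores_granted -= cores


app = ApplicationInfo()
app.add_executor(4)
app.add_executor(4)        # while running, the page would show 8 cores
app.remove_executor(4)
app.remove_executor(4)     # after sc.stop(): the page shows 0 cores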
[jira] [Updated] (SPARK-5522) Accelerate the History Server start
[ https://issues.apache.org/jira/browse/SPARK-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liangliang Gu updated SPARK-5522:
---------------------------------
    Component/s: Web UI

Accelerate the History Server start
-----------------------------------

                 Key: SPARK-5522
                 URL: https://issues.apache.org/jira/browse/SPARK-5522
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core, Web UI
            Reporter: Liangliang Gu

When starting the history server, all the log files are fetched and parsed in order to obtain the applications' metadata, e.g. app name, start time, duration, etc. In our production cluster, there are 2600 log files (160 GB) in HDFS and it takes 3 hours to restart the history server, which is a bit too long for us.

It would be better if the history server could show logs with missing information during start-up and fill in the missing information after fetching and parsing each log file.
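The start-up strategy the issue suggests can be sketched as follows (illustrative only, not Spark's real HistoryServer code): list every event log immediately with placeholder metadata, then fill entries in as each log is parsed in the background.

```python
# Sketch of "show first, parse later" start-up (hypothetical helper names).

def initial_listing(log_files):
    """Cheap first pass: one placeholder entry per log file, no parsing."""
    return {f: {"name": "(loading)", "duration": None} for f in log_files}

def fill_entry(listing, log_file, parsed_metadata):
    """Replace a placeholder once the expensive parse of one log completes."""
    listing[log_file] = parsed_metadata
```

The server becomes usable as soon as the cheap directory listing finishes, instead of blocking for hours on parsing every log; each page entry then upgrades from "(loading)" to real metadata independently.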