[jira] [Created] (SPARK-11182) HDFS Delegation Token will expire when calling "UserGroupInformation.getCurrentUser.addCredentials" in HA mode

2015-10-19 Thread Liangliang Gu (JIRA)
Liangliang Gu created SPARK-11182:
-

 Summary: HDFS Delegation Token will expire when calling 
"UserGroupInformation.getCurrentUser.addCredentials" in HA mode
 Key: SPARK-11182
 URL: https://issues.apache.org/jira/browse/SPARK-11182
 Project: Spark
  Issue Type: Bug
  Components: YARN
Reporter: Liangliang Gu


In HA mode, DFSClient automatically generates an HDFS Delegation Token for each 
NameNode, and these tokens are not updated when Spark updates the Credentials of 
the current user.
Spark should update these tokens as well, in order to avoid a Token Expired error.
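
A minimal sketch of the kind of update this asks for, assuming the standard Hadoop
Credentials and UserGroupInformation APIs (an illustration only, not the change in
the linked pull request):

import scala.collection.JavaConverters._
import org.apache.hadoop.io.Text
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Copy every freshly obtained HDFS delegation token, including the per-NameNode
// tokens that DFSClient generates in HA mode, into the current user's credentials
// so they are all refreshed together.
def addHaHdfsTokens(freshCredentials: Credentials): Unit = {
  val hdfsTokenKind = new Text("HDFS_DELEGATION_TOKEN")
  val updated = new Credentials()
  freshCredentials.getAllTokens.asScala
    .filter(_.getKind == hdfsTokenKind)
    .foreach(token => updated.addToken(token.getService, token))
  UserGroupInformation.getCurrentUser.addCredentials(updated)
}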






[jira] [Commented] (SPARK-11182) HDFS Delegation Token will expire when calling "UserGroupInformation.getCurrentUser.addCredentials" in HA mode

2015-10-19 Thread Liangliang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963091#comment-14963091
 ] 

Liangliang Gu commented on SPARK-11182:
---

https://github.com/apache/spark/pull/9168


> HDFS Delegation Token will expire when calling 
> "UserGroupInformation.getCurrentUser.addCredentials" in HA mode
> --
>
> Key: SPARK-11182
> URL: https://issues.apache.org/jira/browse/SPARK-11182
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Reporter: Liangliang Gu
>
> In HA mode, DFSClient automatically generates an HDFS Delegation Token for each 
> NameNode, and these tokens are not updated when Spark updates the Credentials of 
> the current user.
> Spark should update these tokens as well, in order to avoid a Token Expired error.






[jira] [Closed] (SPARK-8030) Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'

2015-06-02 Thread Liangliang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangliang Gu closed SPARK-8030.

Resolution: Invalid

Actually, "hcfs" is intended. If you read the comments for that method, you'll see 
that it refers to "Hadoop compatible file system".

 Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'
 ---

 Key: SPARK-8030
 URL: https://issues.apache.org/jira/browse/SPARK-8030
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Liangliang Gu

 Spelling Mistake in org.apache.spark.util.Utils: 'fetchHcfsFile' should be 
 'fetchHdfsFile'






[jira] [Commented] (SPARK-8030) Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'

2015-06-01 Thread Liangliang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568578#comment-14568578
 ] 

Liangliang Gu commented on SPARK-8030:
--

https://github.com/apache/spark/pull/6575

 Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'
 ---

 Key: SPARK-8030
 URL: https://issues.apache.org/jira/browse/SPARK-8030
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Liangliang Gu

 Spelling Mistake in org.apache.spark.util.Utils: 'fetchHcfsFile' should be 
 'fetchHdfsFile'






[jira] [Created] (SPARK-8030) Spelling Mistake: 'fetchHcfsFile' should be 'fetchHdfsFile'

2015-06-01 Thread Liangliang Gu (JIRA)
Liangliang Gu created SPARK-8030:


 Summary: Spelling Mistake: 'fetchHcfsFile' should be 
'fetchHdfsFile'
 Key: SPARK-8030
 URL: https://issues.apache.org/jira/browse/SPARK-8030
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Liangliang Gu


Spelling Mistake in org.apache.spark.util.Utils: 'fetchHcfsFile' should be 
'fetchHdfsFile'






[jira] [Created] (SPARK-6491) Spark will put the current working dir to the CLASSPATH

2015-03-24 Thread Liangliang Gu (JIRA)
Liangliang Gu created SPARK-6491:


 Summary: Spark will put the current working dir to the CLASSPATH
 Key: SPARK-6491
 URL: https://issues.apache.org/jira/browse/SPARK-6491
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Liangliang Gu


When running bin/compute-classpath.sh, the output is:

:/spark/conf:/spark/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.5.0-cdh5.2.0.jar:/spark/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar:/spark/lib_managed/jars/datanucleus-core-3.2.10.jar

Because the classpath starts with ':', the first entry is empty and Java adds the 
current working directory to the CLASSPATH, which is not what Spark users expect.

For example, if I run spark-shell from /root and there is a core-site.xml under 
/root/, Spark uses that file as the Hadoop configuration file, even though I have 
already set HADOOP_CONF_DIR=/etc/hadoop/conf.
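
A small check to illustrate the effect, assuming Hadoop's Configuration class is on
the classpath (an illustration, not part of Spark): run it from /root with the
classpath shown above and it prints which core-site.xml was actually loaded.

import org.apache.hadoop.conf.Configuration

// With an empty first CLASSPATH entry, the JVM treats the current working
// directory as part of the classpath, so Configuration can pick up
// /root/core-site.xml instead of the files under HADOOP_CONF_DIR.
object ClasspathConfCheck {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()                // loads core-site.xml from the classpath
    println(conf.getResource("core-site.xml"))    // which core-site.xml was found
    println(conf.get("fs.defaultFS"))             // value taken from that file
  }
}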






[jira] [Closed] (SPARK-6427) spark-sql does not throw error if running in yarn-cluster mode

2015-03-20 Thread Liangliang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangliang Gu closed SPARK-6427.

Resolution: Not a Problem

 spark-sql does not throw error if running in yarn-cluster mode
 --

 Key: SPARK-6427
 URL: https://issues.apache.org/jira/browse/SPARK-6427
 Project: Spark
  Issue Type: Bug
Reporter: Liangliang Gu

 Running spark-sql in yarn-cluster mode does not produce an error, while running 
 spark-shell in yarn-cluster mode fails with the following error:
 Error: Cluster deploy mode is not applicable to Spark shells.
 Run with --help for usage help or --verbose for debug output






[jira] [Created] (SPARK-6427) spark-sql does not throw error if running in yarn-cluster mode

2015-03-20 Thread Liangliang Gu (JIRA)
Liangliang Gu created SPARK-6427:


 Summary: spark-sql does not throw error if running in yarn-cluster 
mode
 Key: SPARK-6427
 URL: https://issues.apache.org/jira/browse/SPARK-6427
 Project: Spark
  Issue Type: Bug
Reporter: Liangliang Gu


Running spark-sql in yarn-cluster mode does not produce an error.

Running spark-shell in yarn-cluster mode, by contrast, fails with the following error:

Error: Cluster deploy mode is not applicable to Spark shells.
Run with --help for usage help or --verbose for debug output
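
A hedged sketch of the kind of check this would require (assumed helper, not
SparkSubmit's actual code): reject cluster deploy mode for interactive entry
points, so spark-sql fails fast in the same way spark-shell does.

// Hypothetical validation helper; the class names listed are the usual main
// classes for spark-shell and spark-sql, but the check itself is an assumption.
def validateDeployMode(mainClass: String, deployMode: String): Unit = {
  val interactiveClasses = Set(
    "org.apache.spark.repl.Main",                               // spark-shell
    "org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver"  // spark-sql
  )
  if (deployMode == "cluster" && interactiveClasses.contains(mainClass)) {
    throw new IllegalArgumentException(
      "Cluster deploy mode is not applicable to Spark shells.")
  }
}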







[jira] [Created] (SPARK-6420) Driver's Block Manager does not use spark.driver.host in Yarn-Client mode

2015-03-19 Thread Liangliang Gu (JIRA)
Liangliang Gu created SPARK-6420:


 Summary: Driver's Block Manager does not use spark.driver.host 
in Yarn-Client mode
 Key: SPARK-6420
 URL: https://issues.apache.org/jira/browse/SPARK-6420
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: Liangliang Gu


In my cluster, the YARN nodes cannot resolve the client's hostname, so I set 
spark.driver.host to the client's IP address.
However, in yarn-client mode the driver's Block Manager uses the hostname rather 
than spark.driver.host.

I got the following error:

TaskSetManager: Lost task 1.1 in stage 0.0 (TID 2, hadoop-node1538098): java.io.IOException: Failed to connect to example-hostname
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.UnresolvedAddressException
    at sun.nio.ch.Net.checkAddress(Net.java:127)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644)
    at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:193)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:200)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1029)
    at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:496)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:481)
    at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
    at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:496)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:481)
    at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:463)
    at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:849)
    at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:199)
    at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:165)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    ... 1 more
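
For reference, a minimal sketch of the configuration this report relies on (the IP
address is an example value, not from the cluster above): setting spark.driver.host
on the client, which the driver's Block Manager should honor in yarn-client mode.

import org.apache.spark.{SparkConf, SparkContext}

// Advertise an address the YARN nodes can actually reach, instead of the
// client's unresolvable hostname (spark-shell / script style snippet).
val conf = new SparkConf()
  .setMaster("yarn-client")
  .setAppName("driver-host-example")
  .set("spark.driver.host", "192.0.2.10")   // client's reachable IP (example value)
val sc = new SparkContext(conf)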







[jira] [Created] (SPARK-5771) Number of Cores in Completed Applications on the Standalone Master Web Page is always 0 if sc.stop() is called

2015-02-12 Thread Liangliang Gu (JIRA)
Liangliang Gu created SPARK-5771:


 Summary: Number of Cores in Completed Applications on the Standalone 
Master Web Page is always 0 if sc.stop() is called
 Key: SPARK-5771
 URL: https://issues.apache.org/jira/browse/SPARK-5771
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Reporter: Liangliang Gu


In Standalone mode, the number of cores shown under Completed Applications on the 
Master Web Page is always zero if sc.stop() is called.

The number is correct if sc.stop() is not called.

The likely reason: after sc.stop() is called, removeExecutor of the ApplicationInfo 
class is invoked for each executor, which reduces the variable coresGranted to zero. 
coresGranted is the value displayed as the number of cores on the Web Page.
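
A simplified sketch of the bookkeeping described above (assumed shape, not the real 
ApplicationInfo code): every removeExecutor call during sc.stop() subtracts the 
executor's cores, so a completed application ends up being displayed with 0 cores.

// Hypothetical, stripped-down version of the accounting; only the coresGranted
// behaviour relevant to this report is modelled.
class ApplicationInfoSketch {
  private var coresGranted: Int = 0          // value rendered in the "Cores" column

  def addExecutor(cores: Int): Unit = coresGranted += cores

  // Called for each executor when the application stops via sc.stop();
  // afterwards coresGranted is back to 0 and the UI shows 0 cores.
  def removeExecutor(cores: Int): Unit = coresGranted -= cores

  def coresForWebUi: Int = coresGranted
}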






[jira] [Updated] (SPARK-5522) Accelerate the History Server start

2015-02-11 Thread Liangliang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liangliang Gu updated SPARK-5522:
-
Component/s: Web UI

 Accelerate the History Server start
 

 Key: SPARK-5522
 URL: https://issues.apache.org/jira/browse/SPARK-5522
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, Web UI
Reporter: Liangliang Gu

 When starting the history server, all the log files are fetched and parsed in 
 order to get each application's metadata, e.g. app name, start time, duration, 
 etc. In our production cluster there are 2600 log files (160 GB) in HDFS, and it 
 takes 3 hours to restart the history server, which is a bit too long for us.
 It would be better if the history server could show applications with missing 
 information during start-up and fill in the missing information after each log 
 file has been fetched and parsed.
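
A hedged sketch of the incremental start-up suggested above (assumed names, not 
Spark's actual history provider): list every application immediately with 
placeholder metadata, then fill in the details as each event log is parsed in the 
background.

import java.util.concurrent.Executors
import scala.collection.concurrent.TrieMap

case class AppSummary(logPath: String, name: Option[String], startTime: Option[Long])

// Hypothetical index: the UI can render listApplications right away, and entries
// gain their real name/start time as background parsing completes.
class IncrementalHistoryIndex(logFiles: Seq[String]) {
  private val apps = TrieMap.empty[String, AppSummary]
  private val pool = Executors.newFixedThreadPool(4)

  // 1. Show every application immediately, with metadata still unknown.
  logFiles.foreach(f => apps.put(f, AppSummary(f, None, None)))

  // 2. Parse event logs in the background and fill in the missing fields.
  logFiles.foreach { f =>
    pool.submit(new Runnable {
      override def run(): Unit = {
        val (name, start) = parseEventLog(f)
        apps.put(f, AppSummary(f, Some(name), Some(start)))
      }
    })
  }

  def listApplications: Seq[AppSummary] = apps.values.toSeq

  // Stand-in for real event-log parsing (assumed helper).
  private def parseEventLog(path: String): (String, Long) = ("unknown-app", 0L)
}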


