Seems I have had one thread spinning at 100% for a while now, in
FsHistoryProvider.initialize(). Maybe something is wrong with my logs on HDFS
that it's reading? Or could it really just take 30 minutes to read all the
history on HDFS?

jstack:

Deadlock Detection:

No deadlocks found.

Thread 2272: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=226 (Compiled frame)
 - java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.util.concurrent.SynchronousQueue$TransferStack$SNode, boolean, long) @bci=174, line=460 (Compiled frame)
 - java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object, boolean, long) @bci=102, line=359 (Interpreted frame)
 - java.util.concurrent.SynchronousQueue.poll(long, java.util.concurrent.TimeUnit) @bci=11, line=942 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=141, line=1068 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1130 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)


Thread 1986: (state = BLOCKED)
 - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
 - org.apache.hadoop.hdfs.PeerCache.run() @bci=41, line=250 (Interpreted frame)
 - org.apache.hadoop.hdfs.PeerCache.access$000(org.apache.hadoop.hdfs.PeerCache) @bci=1, line=41 (Interpreted frame)
 - org.apache.hadoop.hdfs.PeerCache$1.run() @bci=4, line=119 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)


Thread 1970: (state = BLOCKED)


Thread 1969: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=135 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove() @bci=2, line=151 (Interpreted frame)
 - java.lang.ref.Finalizer$FinalizerThread.run() @bci=36, line=209 (Interpreted frame)


Thread 1968: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=503 (Interpreted frame)
 - java.lang.ref.Reference$ReferenceHandler.run() @bci=46, line=133 (Interpreted frame)


Thread 1958: (state = IN_VM)
 - java.lang.Throwable.fillInStackTrace(int) @bci=0 (Compiled frame; information may be imprecise)
 - java.lang.Throwable.fillInStackTrace() @bci=16, line=783 (Compiled frame)
 - java.lang.Throwable.<init>(java.lang.String, java.lang.Throwable) @bci=24, line=287 (Compiled frame)
 - java.lang.Exception.<init>(java.lang.String, java.lang.Throwable) @bci=3, line=84 (Compiled frame)
 - org.json4s.package$MappingException.<init>(java.lang.String, java.lang.Exception) @bci=13, line=56 (Compiled frame)
 - org.json4s.reflect.package$.fail(java.lang.String, java.lang.Exception) @bci=6, line=96 (Compiled frame)
 - org.json4s.Extraction$.convert(org.json4s.JsonAST$JValue, org.json4s.reflect.ScalaType, org.json4s.Formats, scala.Option) @bci=2447, line=554 (Compiled frame)
 - org.json4s.Extraction$.extract(org.json4s.JsonAST$JValue, org.json4s.reflect.ScalaType, org.json4s.Formats) @bci=796, line=331 (Compiled frame)
 - org.json4s.Extraction$.extract(org.json4s.JsonAST$JValue, org.json4s.Formats, scala.reflect.Manifest) @bci=10, line=42 (Compiled frame)
 - org.json4s.Extraction$.extractOpt(org.json4s.JsonAST$JValue, org.json4s.Formats, scala.reflect.Manifest) @bci=7, line=54 (Compiled frame)
 - org.json4s.ExtractableJsonAstNode.extractOpt(org.json4s.Formats, scala.reflect.Manifest) @bci=9, line=40 (Compiled frame)
 - org.apache.spark.util.JsonProtocol$.shuffleWriteMetricsFromJson(org.json4s.JsonAST$JValue) @bci=116, line=702 (Compiled frame)
 - org.apache.spark.util.JsonProtocol$$anonfun$taskMetricsFromJson$2.apply(org.json4s.JsonAST$JValue) @bci=4, line=670 (Compiled frame)
 - org.apache.spark.util.JsonProtocol$$anonfun$taskMetricsFromJson$2.apply(java.lang.Object) @bci=5, line=670 (Compiled frame)
 - scala.Option.map(scala.Function1) @bci=22, line=145 (Compiled frame)
 - org.apache.spark.util.JsonProtocol$.taskMetricsFromJson(org.json4s.JsonAST$JValue) @bci=414, line=670 (Compiled frame)
 - org.apache.spark.util.JsonProtocol$.taskEndFromJson(org.json4s.JsonAST$JValue) @bci=174, line=508 (Compiled frame)
 - org.apache.spark.util.JsonProtocol$.sparkEventFromJson(org.json4s.JsonAST$JValue) @bci=389, line=464 (Compiled frame)
 - org.apache.spark.scheduler.ReplayListenerBus$$anonfun$replay$1.apply(java.lang.String) @bci=34, line=51 (Compiled frame)
 - org.apache.spark.scheduler.ReplayListenerBus$$anonfun$replay$1.apply(java.lang.Object) @bci=5, line=49 (Compiled frame)
 - scala.collection.Iterator$class.foreach(scala.collection.Iterator, scala.Function1) @bci=16, line=743 (Compiled frame)
 - scala.collection.AbstractIterator.foreach(scala.Function1) @bci=2, line=1177 (Interpreted frame)
 - org.apache.spark.scheduler.ReplayListenerBus.replay(java.io.InputStream, java.lang.String) @bci=42, line=49 (Interpreted frame)
 - org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(org.apache.hadoop.fs.FileStatus, org.apache.spark.scheduler.ReplayListenerBus) @bci=69, line=260 (Interpreted frame)
 - org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$6.apply(org.apache.hadoop.fs.FileStatus) @bci=19, line=190 (Interpreted frame)
 - org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$6.apply(java.lang.Object) @bci=5, line=188 (Interpreted frame)
 - scala.collection.TraversableLike$$anonfun$flatMap$1.apply(java.lang.Object) @bci=9, line=252 (Interpreted frame)
 - scala.collection.TraversableLike$$anonfun$flatMap$1.apply(java.lang.Object) @bci=2, line=252 (Interpreted frame)
 - scala.collection.IndexedSeqOptimized$class.foreach(scala.collection.IndexedSeqOptimized, scala.Function1) @bci=22, line=33 (Interpreted frame)
 - scala.collection.mutable.WrappedArray.foreach(scala.Function1) @bci=2, line=35 (Interpreted frame)
 - scala.collection.TraversableLike$class.flatMap(scala.collection.TraversableLike, scala.Function1, scala.collection.generic.CanBuildFrom) @bci=17, line=252 (Interpreted frame)
 - scala.collection.AbstractTraversable.flatMap(scala.Function1, scala.collection.generic.CanBuildFrom) @bci=3, line=104 (Interpreted frame)
 - org.apache.spark.deploy.history.FsHistoryProvider.checkForLogs() @bci=110, line=188 (Interpreted frame)
 - org.apache.spark.deploy.history.FsHistoryProvider.initialize() @bci=38, line=116 (Interpreted frame)
 - org.apache.spark.deploy.history.FsHistoryProvider.<init>(org.apache.spark.SparkConf) @bci=214, line=99 (Interpreted frame)
 - sun.reflect.NativeConstructorAccessorImpl.newInstance0(java.lang.reflect.Constructor, java.lang.Object[]) @bci=0 (Interpreted frame)
 - sun.reflect.NativeConstructorAccessorImpl.newInstance(java.lang.Object[]) @bci=72, line=57 (Interpreted frame)
 - sun.reflect.DelegatingConstructorAccessorImpl.newInstance(java.lang.Object[]) @bci=5, line=45 (Interpreted frame)
 - java.lang.reflect.Constructor.newInstance(java.lang.Object[]) @bci=79, line=526 (Interpreted frame)
 - org.apache.spark.deploy.history.HistoryServer$.main(java.lang.String[]) @bci=89, line=185 (Interpreted frame)
 - org.apache.spark.deploy.history.HistoryServer.main(java.lang.String[]) @bci=4 (Interpreted frame)
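
If I am reading the top frames right, the thread isn't actually blocked: it is busy constructing json4s MappingExceptions inside extractOpt while JsonProtocol replays TaskEnd events, so this looks like slow replay rather than a hang. extractOpt is (roughly) "try the strict extract, catch MappingException, return None", and every such miss pays for Throwable.fillInStackTrace(), the very top frame. A toy sketch of that cost pattern (made-up names, not Spark or json4s code):

object ExceptionControlFlowCost {
  final class MissingFieldException(msg: String) extends Exception(msg)

  // Strict lookup: throws when the (optional) field is absent.
  def extractLike(fields: Map[String, Long], key: String): Long =
    fields.getOrElse(key, throw new MissingFieldException(s"no value for $key"))

  // Mimics the extractOpt pattern: translate the failure exception into None.
  def extractOptLike(fields: Map[String, Long], key: String): Option[Long] =
    try Some(extractLike(fields, key))
    catch { case _: MissingFieldException => None }

  def main(args: Array[String]): Unit = {
    // A "task end" record where the optional field we ask for is absent.
    val event = Map("Shuffle Bytes Written" -> 1024L)

    val start = System.nanoTime()
    var i = 0
    while (i < 1000000) {
      extractOptLike(event, "Shuffle Write Time") // always a miss => exception
      i += 1
    }
    val ms = (System.nanoTime() - start) / 1e6
    println(f"1M missing-field lookups via exceptions: $ms%.0f ms")
    // Each miss builds an exception and fills in its stack trace, which is why
    // Throwable.fillInStackTrace is the hottest frame in the jstack above.
  }
}

That would also line up with the missing "Started HistoryServer" message: per the bottom of the trace, HistoryServer$.main is still inside the FsHistoryProvider constructor, so it has not yet reached the point where it binds the web UI.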



On Thu, May 7, 2015 at 2:17 PM, Koert Kuipers <ko...@tresata.com> wrote:

> Good idea, I will take a look. It does seem to be spinning one CPU at
> 100%...
>
> On Thu, May 7, 2015 at 2:03 PM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>
>> Can you get a jstack for the process? Maybe it's stuck somewhere.
>>
>> On Thu, May 7, 2015 at 11:00 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> I am trying to launch the Spark 1.3.1 history server on a secure cluster.
>>>
>>> I can see in the logs that it successfully logs into Kerberos, and it is
>>> replaying all the logs, but I never see the log message that indicates the
>>> web server has started (I should see something like "Successfully started
>>> service on port 18080." or "Started HistoryServer at
>>> http://somehost:18080"). Yet the daemon stays alive...
>>>
>>> Any idea why the history server would never start the web service?
>>>
>>> Thanks!
>>>
>>>
>>
>>
>> --
>> Marcelo
>>
>
>
