[
https://issues.apache.org/jira/browse/HIVE-13002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174721#comment-15174721
]
Sergey Shelukhin commented on HIVE-13002:
-----------------------------------------
It looks like Spark tests get stuck because, in some cases, 0 splits are
generated:
{noformat}
2016-03-01T09:24:19,709 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO SparkPlan:
2016-03-01T09:24:19,710 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Spark Plan !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2016-03-01T09:24:19,710 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -
2016-03-01T09:24:19,710 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - MapTran 1 <-- ( MapInput 2 (cache off) )
2016-03-01T09:24:19,710 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) -
2016-03-01T09:24:19,710 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Spark Plan !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2016-03-01T09:24:19,711 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO Utilities: PLAN PATH = file:/home/hiveptest/54.158.228.44-hiveptest-0/apache-github-source-source/itests/qtest-spark/target/tmp/scratchdir/hiveptest/894f65b8-0cb1-495f-9fa9-4948b8e482e1/hive_2016-03-01_09-24-18_701_9015622114754920872-1/-mr-10005/550ff857-09c8-4e04-b4be-cb6084110e22/map.xml
2016-03-01T09:24:19,712 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO SerializationUtilities: Deserializing MapWork using kryo
2016-03-01T09:24:19,714 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO CombineHiveInputFormat: Total number of paths: 1, launching 1 threads to check non-combinable ones.
2016-03-01T09:24:19,717 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO CombineHiveInputFormat: CombineHiveInputSplit creating pool for file:/home/hiveptest/54.158.228.44-hiveptest-0/apache-github-source-source/itests/qtest-spark/target/tmp/scratchdir/hiveptest/894f65b8-0cb1-495f-9fa9-4948b8e482e1/hive_2016-03-01_09-24-18_701_9015622114754920872-1/-mr-10005/0; using filter path file:/home/hiveptest/54.158.228.44-hiveptest-0/apache-github-source-source/itests/qtest-spark/target/tmp/scratchdir/hiveptest/894f65b8-0cb1-495f-9fa9-4948b8e482e1/hive_2016-03-01_09-24-18_701_9015622114754920872-1/-mr-10005/0
2016-03-01T09:24:19,722 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO FileInputFormat: Total input paths to process : 1
2016-03-01T09:24:19,723 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO CombineHiveInputFormat: number of splits 0
2016-03-01T09:24:19,723 INFO [stderr-redir-1[]]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/01 09:24:19 INFO CombineHiveInputFormat: Number of all splits 0
2016-03-01T09:24:19,725 DEBUG [RPC-Handler-3[]]: rpc.KryoMessageCodec (KryoMessageCodec.java:decode(98)) - Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (6 bytes)
2016-03-01T09:24:19,725 DEBUG [RPC-Handler-3[]]: rpc.KryoMessageCodec (KryoMessageCodec.java:decode(98)) - Decoded message of type org.apache.hive.spark.client.BaseProtocol$JobSubmitted (95 bytes)
2016-03-01T09:24:19,725 DEBUG [RPC-Handler-3[]]: rpc.RpcDispatcher (RpcDispatcher.java:channelRead0(74)) - [ClientProtocol] Received RPC message: type=CALL id=189 payload=org.apache.hive.spark.client.BaseProtocol$JobSubmitted
2016-03-01T09:24:19,725 INFO [RPC-Handler-3[]]: client.SparkClientImpl (SparkClientImpl.java:handle(571)) - Received spark job ID: 36 for 5ec4a71b-de5c-4f01-bf41-dc67c4875e2e
{noformat}
Then it hangs forever, reporting that the job has started, until it times out.
There is no logging at all after 09:24:15 in the Spark logs, so I am assuming it's
some issue with the Hive-on-Spark (HoS) test path when no splits are generated.
Looking into why this could be the case.
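For reference, a minimal sketch (plain Spark Java API, hypothetical class name, no Hive code involved) of why zero splits should not, by itself, hang Spark: an action over an RDD with no partitions normally returns immediately, which points at the HoS client/job-monitoring side rather than Spark's execution of an empty job.
{noformat}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical stand-alone sketch (not Hive code): an action over an RDD with
// zero partitions completes right away instead of hanging.
public class EmptyJobSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("empty-job-sketch").setMaster("local[1]");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
      JavaRDD<String> empty = sc.emptyRDD();          // 0 partitions, 0 records
      System.out.println("count = " + empty.count()); // prints 0 immediately
    }
  }
}
{noformat}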
> Hive object is not thread safe, is shared via a threadlocal and thus should
> not be passed around too much - part 1
> ------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-13002
> URL: https://issues.apache.org/jira/browse/HIVE-13002
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-13002.01.patch, HIVE-13002.02.patch,
> HIVE-13002.03.patch, HIVE-13002.patch
>
>
> Discovered in some q test run:
> {noformat}
> TestCliDriver.testCliDriver_insert_values_orig_table:123->runTest:199
> Unexpected exception java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
> at java.util.HashMap$EntryIterator.next(HashMap.java:966)
> at java.util.HashMap$EntryIterator.next(HashMap.java:964)
> at
> org.apache.hadoop.hive.ql.metadata.Hive.dumpAndClearMetaCallTiming(Hive.java:3412)
> at
> org.apache.hadoop.hive.ql.Driver.dumpMetaCallTimingWithoutEx(Driver.java:574)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1722)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1342)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1113)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1101)
> {noformat}
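The failure mode in the trace above is the classic one for a plain HashMap touched from two threads: one thread iterates the map (as dumpAndClearMetaCallTiming does) while another thread, holding the same Hive object obtained via the threadlocal, keeps mutating it. A minimal stand-alone sketch of that race (hypothetical names, not Hive code):
{noformat}
import java.util.HashMap;
import java.util.Map;

// Sketch of the race: iterating a plain HashMap while another thread mutates it
// fails fast with ConcurrentModificationException, analogous to the iteration
// in Hive.dumpAndClearMetaCallTiming when the Hive object is shared across threads.
public class SharedMapRace {
  private static final Map<String, Long> timings = new HashMap<>();

  public static void main(String[] args) throws InterruptedException {
    Thread writer = new Thread(() -> {
      for (long i = 0; ; i++) {
        timings.put("call-" + (i % 1000), i);   // concurrent mutation
      }
    });
    writer.setDaemon(true);
    writer.start();

    // Analogous to dumpAndClearMetaCallTiming: iterate the shared map.
    for (int round = 0; round < 1_000; round++) {
      for (Map.Entry<String, Long> e : timings.entrySet()) {
        e.getKey();   // typically throws ConcurrentModificationException here
      }
      Thread.sleep(1);
    }
  }
}
{noformat}
The usual remedies are either to stop sharing the Hive object across threads (the direction the issue title points to) or to guard the shared map, e.g. with synchronization or a ConcurrentHashMap; the sketch is only meant to show why the fail-fast iterator trips.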