Re: [SparkSQL] SparkSQL performance on small TPCDS tables is very low when compared to Drill or Presto

2018-03-31 Thread Tin Vu
Hi Gourav, Thank you for your response. Here are the answers to your questions: 1. Spark 2.3.0 2. I was using the 'spark-sql' command, for example: 'spark-sql --master spark:/*:7077 --database tpcds_bin_partitioned_orc_100 -f $file_name' with file_name being the file that contains the SQL script ("select *

Re: [SparkSQL] SparkSQL performance on small TPCDS tables is very low when compared to Drill or Presto

2018-03-31 Thread Gourav Sengupta
Hi Tin, This sounds interesting. While I would prefer to think that Presto and Drill have can you please provide the following details: 1. SPARK version 2. The exact code used in SPARK (the full code that was used) 3. HADOOP version I do think that SPARK and DRILL have complementary and

In spark streaming application how to distinguish between normal and abnormal termination of application?

2018-03-31 Thread Igor Makhlin
Hi All, I'm looking for a way to distinguish between normal and abnormal termination of a Spark Streaming application (with checkpointing enabled). Adding an application listener doesn't really help because the onApplicationEnd event carries no information about the cause of the termination.