Re: Spark Task is not created

2016-06-25 Thread Akhil Das
It would be good if you could paste the piece of code that you are executing.

On Sun, Jun 26, 2016 at 11:21 AM, Ravindra wrote:

> Hi All,
>
> Maybe I just need to set some property, or it's a known issue. My Spark
> application hangs in the test environment whenever I see the following
> message:
>
> 16/06/26 11:13:34 INFO DAGScheduler: Submitting 2 missing tasks from
> ShuffleMapStage 145 (MapPartitionsRDD[590] at rdd at
> WriteDataFramesDecorator.scala:61)
> 16/06/26 11:13:34 INFO TaskSchedulerImpl: Adding task set 145.0 with 2
> tasks
> 16/06/26 11:13:34 INFO TaskSetManager: Starting task 0.0 in stage 145.0
> (TID 186, localhost, PROCESS_LOCAL, 2389 bytes)
> 16/06/26 11:13:34 INFO Executor: Running task 0.0 in stage 145.0 (TID 186)
> 16/06/26 11:13:34 INFO BlockManager: Found block rdd_575_0 locally
> 16/06/26 11:13:34 INFO GenerateMutableProjection: Code generated in 3.796
> ms
> 16/06/26 11:13:34 INFO Executor: Finished task 0.0 in stage 145.0 (TID
> 186). 2578 bytes result sent to driver
> 16/06/26 11:13:34 INFO TaskSetManager: Finished task 0.0 in stage 145.0
> (TID 186) in 24 ms on localhost (1/2)
>
> It happens with any action. The application works fine whenever I notice
> "Submitting 1 missing tasks from ShuffleMapStage". To get that, I have to
> tweak the plan, e.g. with repartition or coalesce, but even that doesn't
> always help.
>
> Some of the Spark properties are given below:
>
> Name                                 Value
> spark.app.id                         local-1466914377931
> spark.app.name                       SparkTest
> spark.cores.max                      3
> spark.default.parallelism            1
> spark.driver.allowMultipleContexts   true
> spark.executor.id                    driver
> spark.externalBlockStore.folderName  spark-050049bd-c058-4035-bc3d-2e73a08e8d0c
> spark.master                         local[2]
> spark.scheduler.mode                 FIFO
> spark.ui.enabled                     true
>
>
> Thanks,
> Ravi.
>
>


-- 
Cheers!


Spark Task is not created

2016-06-25 Thread Ravindra
Hi All,

Maybe I just need to set some property, or it's a known issue. My Spark
application hangs in the test environment whenever I see the following
message:

16/06/26 11:13:34 INFO DAGScheduler: Submitting 2 missing tasks from
ShuffleMapStage 145 (MapPartitionsRDD[590] at rdd at
WriteDataFramesDecorator.scala:61)
16/06/26 11:13:34 INFO TaskSchedulerImpl: Adding task set 145.0 with 2 tasks
16/06/26 11:13:34 INFO TaskSetManager: Starting task 0.0 in stage 145.0
(TID 186, localhost, PROCESS_LOCAL, 2389 bytes)
16/06/26 11:13:34 INFO Executor: Running task 0.0 in stage 145.0 (TID 186)
16/06/26 11:13:34 INFO BlockManager: Found block rdd_575_0 locally
16/06/26 11:13:34 INFO GenerateMutableProjection: Code generated in 3.796 ms
16/06/26 11:13:34 INFO Executor: Finished task 0.0 in stage 145.0 (TID
186). 2578 bytes result sent to driver
16/06/26 11:13:34 INFO TaskSetManager: Finished task 0.0 in stage 145.0
(TID 186) in 24 ms on localhost (1/2)

It happens with any action. The application works fine whenever I notice
"Submitting 1 missing tasks from ShuffleMapStage". To get that, I have to
tweak the plan, e.g. with repartition or coalesce, but even that doesn't
always help.
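
By "tweaking the plan" I mean something along these lines (a sketch only; df
and someRdd stand in for the real DataFrame/RDD in the job):

// Sketch: collapse the two-partition shuffle into a single task, which is
// the case where the job runs fine. df / someRdd are placeholders.
val single = df.coalesce(1)    // narrow dependency, no extra shuffle
single.count()                 // any action triggers the stage

// or, on an RDD, force a full reshuffle into one partition:
val singleRdd = someRdd.repartition(1)
singleRdd.count()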

Some of the Spark properties are given below:

Name                                 Value
spark.app.id                         local-1466914377931
spark.app.name                       SparkTest
spark.cores.max                      3
spark.default.parallelism            1
spark.driver.allowMultipleContexts   true
spark.executor.id                    driver
spark.externalBlockStore.folderName  spark-050049bd-c058-4035-bc3d-2e73a08e8d0c
spark.master                         local[2]
spark.scheduler.mode                 FIFO
spark.ui.enabled                     true


Thanks,
Ravi.


Streaming and Batch code sharing

2016-06-25 Thread Nikhil Goyal
Hi,

Does anyone have a good example where realtime and batch are able to share
the same code?
(Other than this one:
https://github.com/databricks/reference-apps/blob/master/logs_analyzer/chapter1/reuse.md
)
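
For context, the pattern in that link boils down to writing the core logic
once against plain RDDs and calling it from both sides, roughly like this
(names and paths are illustrative); I am looking for something beyond it:

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.StreamingContext

// Core logic written once, against plain RDDs.
def hitCounts(lines: RDD[String]): RDD[(String, Long)] =
  lines.map(line => (line.split(" ")(0), 1L)).reduceByKey(_ + _)

// Batch: call it on a static input.
def runBatch(sc: SparkContext): Unit =
  hitCounts(sc.textFile("hdfs:///logs/2016-06-25")).take(10).foreach(println)

// Streaming: reuse the very same function on each micro-batch.
def runStreaming(ssc: StreamingContext): Unit =
  ssc.socketTextStream("localhost", 9999)
    .transform(rdd => hitCounts(rdd))
    .print()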

Thanks
Nikhil


Re: Running JavaBased Implementation of StreamingKmeans Spark

2016-06-25 Thread Biplob Biswas
Hi,

I tried doing that, but even then I couldn't see any results: I started the
program and added the files afterwards.
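
In case the pastebin link goes stale: the gist of the setup is roughly the
following (a Scala sketch of the same wiring, not my exact Java code;
parameters are illustrative):

import org.apache.spark.SparkConf
import org.apache.spark.mllib.clustering.StreamingKMeans
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingKMeansSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingKMeansSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // textFileStream only picks up files that appear in the directory
    // AFTER the context starts; files already present are ignored.
    def vectors(dir: String) =
      ssc.textFileStream(dir).map(s => Vectors.dense(s.split(',').map(_.toDouble)))

    val training = vectors("D:/spark/streaming example/Data Sets/training")
    val test     = vectors("D:/spark/streaming example/Data Sets/test")

    val model = new StreamingKMeans()
      .setK(2)                  // illustrative parameters, not tuned
      .setDecayFactor(1.0)
      .setRandomCenters(2, 0.0) // dim must match the input vectors

    model.trainOn(training)
    model.predictOn(test).print() // one block of cluster ids per batch

    ssc.start()
    ssc.awaitTermination()
  }
}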

Thanks & Regards
Biplob Biswas

On Sat, Jun 25, 2016 at 2:19 AM, Jayant Shekhar wrote:

> Hi Biplob,
>
> Can you try adding new files to the training/test directories after you
> have started your streaming application? Especially the test directory, as
> you are printing your predictions.
>
> On Fri, Jun 24, 2016 at 2:32 PM, Biplob Biswas wrote:
>
>>
>> Hi,
>>
>> I implemented the StreamingKMeans example provided on the Spark website,
>> but in Java.
>> The full implementation is here:
>>
>> http://pastebin.com/CJQfWNvk
>>
>> But I am not getting anything in the output except occasional timestamps
>> like the one below:
>>
>> -------------------------------------------
>> Time: 1466176935000 ms
>> -------------------------------------------
>>
>> Also, I have two directories:
>> "D:\spark\streaming example\Data Sets\training"
>> "D:\spark\streaming example\Data Sets\test"
>>
>> and inside these directories I have one file each, "samplegpsdata_train.txt"
>> and "samplegpsdata_test.txt", with the training data having 500 data points
>> and the test data 60 data points.
>>
>> I am very new to Spark, and any help is highly appreciated.
>>
>> Thank you so much
>> Biplob Biswas
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Running-JavaBased-Implementation-of-StreamingKmeans-Spark-tp27225.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>>
>


Re: spark-sql jdbc dataframe mysql data type issue

2016-06-25 Thread Mich Talebzadeh
Please select 10 sample rows for columns id and ctime from each table (MySQL
and Spark) and post the output.
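
Something like the following will do (df being the DataFrame you loaded; the
MySQL side is an ordinary SELECT):

// Spark side: first 10 rows of the two columns in question.
df.select("id", "ctime").show(10)

// MySQL side, for comparison:
//   SELECT id, ctime FROM test.user_action_2 LIMIT 10;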

HTH

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 25 June 2016 at 13:36, 刘虓 wrote:

> Hi,
> I came across this strange behavior of Apache Spark 1.6.1:
> when I was reading a MySQL table into a Spark DataFrame, a column of data
> type float got mapped to double.
>
> dataframe schema:
>
> root
>  |-- id: long (nullable = true)
>  |-- ctime: double (nullable = true)
>  |-- atime: double (nullable = true)
>
> mysql schema:
>
> mysql> desc test.user_action_2;
>
> +-------+------------------+------+-----+---------+-------+
> | Field | Type             | Null | Key | Default | Extra |
> +-------+------------------+------+-----+---------+-------+
> | id    | int(10) unsigned | YES  |     | NULL    |       |
> | ctime | float            | YES  |     | NULL    |       |
> | atime | double           | YES  |     | NULL    |       |
> +-------+------------------+------+-----+---------+-------+
> I wonder if anyone has seen this behavior before.
>


spark-sql jdbc dataframe mysql data type issue

2016-06-25 Thread 刘虓
Hi,
I came across this strange behavior of Apache Spark 1.6.1:
when I was reading a MySQL table into a Spark DataFrame, a column of data
type float got mapped to double.

dataframe schema:

root
 |-- id: long (nullable = true)
 |-- ctime: double (nullable = true)
 |-- atime: double (nullable = true)

mysql schema:

mysql> desc test.user_action_2;

+-------+------------------+------+-----+---------+-------+
| Field | Type             | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| id    | int(10) unsigned | YES  |     | NULL    |       |
| ctime | float            | YES  |     | NULL    |       |
| atime | double           | YES  |     | NULL    |       |
+-------+------------------+------+-----+---------+-------+
I wonder if anyone has seen this behavior before.
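
In case it is useful to others, one way to work around the widening (a
sketch only; sqlContext is assumed to exist and the connection details are
placeholders) is to cast the column back after load:

import org.apache.spark.sql.types.FloatType

// The JDBC read maps MySQL FLOAT to DoubleType, so cast the affected
// column back to float once the DataFrame is loaded.
val df = sqlContext.read.format("jdbc")
  .option("url", "jdbc:mysql://host:3306/test")
  .option("dbtable", "user_action_2")
  .option("user", "user")
  .option("password", "password")
  .load()

val fixed = df.withColumn("ctime", df("ctime").cast(FloatType))
fixed.printSchema() // ctime should now show as float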

