Spark SQL - JDBC connectivity
Hi, I would like to know the steps to connect to Spark SQL from the Spring framework (web UI). Also, how do I run and deploy the web application?
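One common way to reach Spark SQL over JDBC from a Java/Spring web app is to start the Spark Thrift Server (sbin/start-thriftserver.sh) and connect with the standard Hive JDBC driver, since the Thrift Server speaks the HiveServer2 protocol. A minimal sketch follows; the host/port and the class and method names are my own assumptions, and it presumes hive-jdbc is on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SparkSqlJdbcClient {

    // Builds a HiveServer2-style JDBC URL; the Spark Thrift Server speaks
    // the same protocol, so the standard Hive JDBC driver can connect to it.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Opens a connection and prints the first column of each row.
    // Requires a running Spark Thrift Server, e.g.:
    //   runQuery(buildUrl("localhost", 10000, "default"), "SELECT 1");
    static void runQuery(String url, String sql) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

The web application itself is built and deployed like any other Spring WAR (e.g. on Tomcat); only the JDBC driver jars need to be bundled, and the Spark cluster side needs no change.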
Unable to Run Spark Streaming Job in Hadoop YARN mode
Hi All, I am unable to run a Spark Streaming job in my Hadoop cluster; it behaves unexpectedly. When I submit a job, it fails with a socket exception from HDFS. If I run the same job a second or third time, it runs for a while and then stops. I am confused. Is there any configuration in yarn-site.xml specific to Spark? Please advise.
Issues faced while running a Spark Streaming job in YARN cluster mode
Hi, I am able to run my Spark Streaming job in local mode, but when I try to run the same job on my YARN cluster, it throws errors. Any help in this regard is appreciated. Here are my exception logs:

Exception 1:

java.net.SocketTimeoutException: 48 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.16.28.192:50010 remote=/172.16.28.193:46147]
    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:559)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:728)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:496)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
    at java.lang.Thread.run(Thread.java:745)

Exception 2:

2016-03-22 12:17:47,838 WARN org.apache.hadoop.hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:658)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
    at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
    at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1460)
    at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:773)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:84)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-03-22 12:17:47,838 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1458629096860_0001_01_01 transitioned from KILLING to DONE
2016-03-22 12:17:47,841 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_1458629096860_0001_01_01 from application application_1458629096860_0001
2016-03-22 12:17:47,842 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1458629096860_0001
2016-03-22 12:17:47,842 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /node1:50010 for block, add to deadNodes and continue. java.nio.channels.ClosedByInterruptException
java.nio.channels.ClosedByInterruptException
How to Catch a Spark Streaming Twitter Exception (Written in Java)
Dear All, I am facing a problem with my Spark Twitter streaming code: whenever twitter4j throws an exception, I am unable to catch it. Could anyone help me catch this exception? Here is the pseudocode:

SparkConf sparkConf = new SparkConf().setMaster("local[2]").setAppName("Test");
// SparkConf sparkConf = new SparkConf().setMaster("yarn-client").setAppName("Test");
// SparkTwitterStreaming sss = new SparkTwitterStreaming();
final int batchIntervalInSec = 60; // choose a long interval to reduce the number of files created
Duration batchInterval = Durations.seconds(batchIntervalInSec);
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, batchInterval);
try {
    JavaReceiverInputDStream<Status> receiverStream =
        TwitterUtils.createStream(jssc, a2, params);
    JavaDStream<String> tweets = receiverStream.map(new Function<Status, String>() {
        public String call(Status tweet) throws Exception {
            Gson gson = new Gson();
            return gson.toJson(tweet);
        }
    });
    jssc.start();
    jssc.awaitTermination();
} catch (Exception e) {
    // never reached when twitter4j fails inside the receiver
}

I have no issue streaming the Twitter data. But whenever the Twitter account credentials expire, I need to catch this exception and apply a workaround. Here is the exception trace:

INFO spark.streaming.receiver.BlockGenerator - Started BlockGenerator
INFO spark.streaming.scheduler.ReceiverTracker - Registered receiver for stream 0 from 172.16.28.183:34829
INFO spark.streaming.receiver.ReceiverSupervisorImpl - Starting receiver
INFO spark.streaming.twitter.TwitterReceiver - Twitter receiver started
INFO spark.streaming.receiver.ReceiverSupervisorImpl - Called receiver onStart
INFO spark.streaming.receiver.ReceiverSupervisorImpl - Waiting for receiver to be stopped
INFO twitter4j.TwitterStreamImpl - Establishing connection.
INFO twitter4j.TwitterStreamImpl - 401:Authentication credentials (https://dev.twitter.com/pages/auth) were missing or incorrect. Ensure that you have set valid consumer key/secret, access token/secret, and the system clock is in sync.
Error 401 Unauthorized: Problem accessing '/1.1/statuses/filter.json'. Reason: Unauthorized
INFO twitter4j.TwitterStreamImpl - Waiting for 1 milliseconds
WARN spark.streaming.receiver.ReceiverSupervisorImpl - Restarting receiver with delay 2000 ms: Error receiving tweets 401:Authentication credentials (https://dev.twitter.com/pages/auth) were missing or incorrect. Ensure that you have set valid consumer key/secret, access token/secret, and the system clock is in sync.
Error 401 Unauthorized: Problem accessing '/1.1/statuses/filter.json'. Reason: Unauthorized
Relevant discussions can be found on the Internet at:
    http://www.google.co.jp/search?q=944a924a
    http://www.google.co.jp/search?q=24fd66dc
TwitterException{exceptionCode=[944a924a-24fd66dc], statusCode=401, message=null, code=-1, retryAfter=-1, rateLimitStatus=null, version=3.0.3}
    at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:177)
    at twitter4j.internal.http.HttpClientWrapper.request(HttpClientWrapper.java:61)
    at twitter4j.internal.http.HttpClientWrapper.post(HttpClientWrapper.java:98)
    at twitter4j.TwitterStreamImpl.getFilterStream(TwitterStreamImpl.java:304)
    at twitter4j.TwitterStreamImpl$7.getStream(TwitterStreamImpl.java:292)
    at twitter4j.TwitterStreamImpl$TwitterStreamConsumer.run(TwitterStreamImpl.java:462)

Please let me know if you need more clarity on my question. Thanks, Sony.
Terminate a Spark job in Eclipse
Hi Friends, Can anyone help me with how to terminate a Spark job from Eclipse using Java code? Thanks, Soniya
Spark Twitter streaming
Hello friends, I need urgent help. I am using Spark Streaming to get tweets from Twitter and load the data into HDFS. I want to find out the tweet source: whether it is from the web, mobile web, Facebook, etc. Could you please help me with the logic? Thanks, Soniya
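If you are using twitter4j (as the Spark Twitter receiver does), Status.getSource() typically returns the client as an HTML anchor, e.g. <a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>, though it can also be a bare string like "web". A small helper can strip the markup so you can store just the client name alongside the tweet; this is a sketch, and the class and method names are mine:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TweetSource {

    // Matches the text between the closing ">" of the opening <a ...> tag
    // and the "</a>" that ends the anchor.
    private static final Pattern ANCHOR_TEXT = Pattern.compile(">([^<]+)</a>");

    // Extracts the client name from a Status.getSource()-style string.
    // Falls back to returning the input unchanged when it is not an anchor.
    static String clientName(String source) {
        Matcher m = ANCHOR_TEXT.matcher(source);
        return m.find() ? m.group(1) : source;
    }
}
```

Inside the map function you would call clientName(status.getSource()) before writing the record to HDFS.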
Re: spark job submission in yarn-cluster mode failing
Hi, I am facing the error message below now. Please help me.

2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /xxx.xx.xx.xx:50010 for block, add to deadNodes and continue. java.nio.channels.ClosedByInterruptException
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:658)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
    at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
    at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1460)
    at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:773)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
    at java.io.DataInputStream.read(DataInputStream.java:100)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:84)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Thanks
Soniya

On Thu, Jan 21, 2016 at 5:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Please also check the AppMaster log.
>
> Thanks
>
> On Jan 21, 2016, at 3:51 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
> Can you look in the executor logs and see why the sparkcontext is being
> shut down? A similar discussion happened here previously.
> http://apache-spark-user-list.1001560.n3.nabble.com/RECEIVED-SIGNAL-15-SIGTERM-td23668.html
>
> Thanks
> Best Regards
>
> On Thu, Jan 21, 2016 at 5:11 PM, Soni spark <soni2015.sp...@gmail.com>
> wrote:
>
>> Hi Friends,
>>
>> My Spark job runs successfully in local mode but fails in cluster mode.
>> Below is the error message I am getting. Can anyone help me?
>>
>> 16/01/21 16:38:07 INFO twitter4j.TwitterStreamImpl: Establishing connection.
>> 16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started
>> 16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Called receiver onStart
>> 16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Waiting for receiver to be stopped
>> 16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
>> 16/01/21 16:38:10 INFO streaming.StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
>> 16/01/21 16:38:10 INFO scheduler.ReceiverTracker: Sent stop signal to all 1 receivers
>> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Received stop signal
>> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopping receiver with message: Stopped by driver:
>> 16/01/21 16:38:10 INFO twitter.TwitterReceiver: Twitter receiver stopped
>> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Called receiver onStop
>> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Deregistering receiver 0
>> 16/01/21 16:38:10 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver
>> 16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopped receiver 0
>> 16/01/21 16:38:10 INFO receiver.BlockGenerator: Stopping BlockGenerator
>> 16/01/21 16:38:10 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
>>
>> Thanks
>>
>> Soniya
spark job submission in yarn-cluster mode failing
Hi Friends, My Spark job runs successfully in local mode but fails in cluster mode. Below is the error message I am getting. Can anyone help me?

16/01/21 16:38:07 INFO twitter4j.TwitterStreamImpl: Establishing connection.
16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started
16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Called receiver onStart
16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Waiting for receiver to be stopped
16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
16/01/21 16:38:10 INFO streaming.StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
16/01/21 16:38:10 INFO scheduler.ReceiverTracker: Sent stop signal to all 1 receivers
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Received stop signal
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopping receiver with message: Stopped by driver:
16/01/21 16:38:10 INFO twitter.TwitterReceiver: Twitter receiver stopped
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Called receiver onStop
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Deregistering receiver 0
16/01/21 16:38:10 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopped receiver 0
16/01/21 16:38:10 INFO receiver.BlockGenerator: Stopping BlockGenerator
16/01/21 16:38:10 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...

Thanks Soniya
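"RECEIVED SIGNAL 15: SIGTERM" in the ApplicationMaster log usually means YARN itself killed the container, most often for exceeding its memory allocation (the NodeManager log will say "killing container" with the reason). One thing worth trying is raising the driver/executor memory and the YARN memory overhead at submit time; the values below are illustrative, not a known fix for this specific job:

```
spark-submit \
  --master yarn-cluster \
  --driver-memory 2g \
  --executor-memory 2g \
  --conf spark.yarn.driver.memoryOverhead=1024 \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  your-streaming-app.jar
```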
Unable to create a Hive table using HiveContext
Hi friends, I am trying to create a Hive table through Spark with Java code in Eclipse, using the code below:

HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());
sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");

but I am getting this error:

ERROR XBM0J: Directory /home/workspace4/Test/metastore_db already exists.

I am not sure why the metastore is being created in the workspace. Please help me. Thanks Soniya
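For context: when HiveContext finds no hive-site.xml on the classpath, it spins up an embedded Derby metastore in the current working directory, which in Eclipse is the project folder; a stale metastore_db directory (or a second HiveContext pointing at it) can then produce the Derby XBM0J error. One workaround, sketched here with example paths, is to put a hive-site.xml on the application classpath that points the metastore at a fixed location:

```
<!-- hive-site.xml on the application classpath; both paths are examples -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/tmp/spark-metastore/metastore_db;create=true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```

Deleting the leftover /home/workspace4/Test/metastore_db directory (or its db.lck lock file) between runs may also clear the error.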
create hive table in Spark with Java code
Hi Friends, I have created a Hive external table with a partition. I want to alter the Hive table partition through Spark with Java code:

alter table table1 add if not exists partition (datetime='2015-12-01')
location 'hdfs://localhost:54310/spark/twitter/datetime=2015-12-01/'

I am currently executing the query above manually; I want to execute it through Spark with Java code. Please help me. Thanks Soniya
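The manual DDL above can be issued from Java by passing the same string to HiveContext.sql(...). A sketch of building that statement follows; the table, date, and HDFS path are the ones from the post, while the class and method names are mine:

```java
public class HivePartitionDdl {

    // Builds an ALTER TABLE ... ADD PARTITION statement like the one above,
    // so it can be passed to HiveContext from Java, e.g.:
    //   sqlContext.sql(addPartition("table1", "2015-12-01",
    //       "hdfs://localhost:54310/spark/twitter/datetime=2015-12-01/"));
    static String addPartition(String table, String datetime, String location) {
        return "ALTER TABLE " + table
             + " ADD IF NOT EXISTS PARTITION (datetime='" + datetime + "')"
             + " LOCATION '" + location + "'";
    }
}
```

Note that ADD PARTITION does not quote or escape its arguments here, so only pass trusted values.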
Epoch date-time problem when loading data into Spark
Hi Friends, I have written a Spark Streaming program in Java to access Twitter tweets, and it is working fine. I can copy the Twitter feeds to an HDFS location batch by batch. For each batch, it creates a folder with an epoch timestamp. For example, if I give the HDFS location as hdfs://localhost:54310/twitter/, the folders are created like below:

/spark/twitter/-144958080/
/spark/twitter/-144957984/

I want to create folder names in yyyy-MM-dd-HH format instead of the default epoch format, like below, so that I can do Hive partitioning easily to access the data:

/spark/twitter/2015-12-08-01/

Can anyone help me? Thank you so much in advance. Thanks Soniya
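Since saveAsTextFiles always appends the batch's epoch timestamp, the usual workaround is to write each batch yourself from foreachRDD and build the folder path from the batch time. A sketch of the path-building part (the class and method names are mine):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class BatchFolder {

    // Turns a batch time in epoch milliseconds into the yyyy-MM-dd-HH
    // folder layout wanted for Hive partitions. Inside the streaming job
    // this would be called from foreachRDD with the batch's Time, e.g.:
    //   tweets.foreachRDD(new Function2<JavaRDD<String>, Time, Void>() {
    //       public Void call(JavaRDD<String> rdd, Time time) {
    //           rdd.saveAsTextFile(folderFor("/spark/twitter", time.milliseconds()));
    //           return null;
    //       }
    //   });
    static String folderFor(String base, long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd-HH");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return base + "/" + fmt.format(new Date(epochMillis));
    }
}
```

Using UTC (or a fixed cluster time zone) keeps folder names unambiguous for the Hive partition scheme.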
Spark twitter streaming in Java
Dear Friends, I am struggling with Spark Twitter streaming: I am not getting any data. Please correct the code below if you find any mistakes.

import org.apache.spark.*;
import org.apache.spark.api.java.function.*;
import org.apache.spark.streaming.*;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.twitter.*;
import twitter4j.GeoLocation;
import twitter4j.Status;
import java.util.Arrays;
import scala.Tuple2;

public class SparkTwitterStreaming {

    public static void main(String[] args) {
        final String consumerKey = "XXX";
        final String consumerSecret = "XX";
        final String accessToken = "XX";
        final String accessTokenSecret = "XXX";
        SparkConf conf = new SparkConf().setMaster("local[2]")
                .setAppName("SparkTwitterStreaming");
        // Duration is in milliseconds; the original new Duration(6) gave a
        // 6 ms batch interval, which is almost certainly not what was meant
        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(60000));
        System.setProperty("twitter4j.oauth.consumerKey", consumerKey);
        System.setProperty("twitter4j.oauth.consumerSecret", consumerSecret);
        System.setProperty("twitter4j.oauth.accessToken", accessToken);
        System.setProperty("twitter4j.oauth.accessTokenSecret", accessTokenSecret);

        String[] filters = new String[] { "Narendra Modi" };
        JavaReceiverInputDStream<Status> twitterStream =
                TwitterUtils.createStream(jssc, filters);

        // Output the text of all matching tweets
        JavaDStream<String> statuses = twitterStream.map(
                new Function<Status, String>() {
                    public String call(Status status) {
                        return status.getText();
                    }
                });
        statuses.print();
        statuses.dstream().saveAsTextFiles("/home/apache/tweets", "txt");

        // Without these two calls the streaming job never starts, which is
        // the most likely reason no data was arriving
        jssc.start();
        jssc.awaitTermination();
    }
}