i get the error Py4JJavaError: An error occurred while calling o177.showString while running the code below

2016-10-25 Thread muhammet pakyürek
I use Spark 2.0.1 and work with pyspark.sql DataFrames.


lower = arguments["lower"]
lower_udf = udf(lambda x: lower if x
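
The snippet above is cut off in the archive. A minimal sketch of the usual pattern, assuming the intent is to clip values below a threshold (data, column name and threshold are placeholders); a Py4JJavaError raised at showString is often just the UDF itself failing on the executors, for example when comparing None with a number under Python 3, and it only surfaces when show() materializes the plan.

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("udf-example").getOrCreate()

arguments = {"lower": 0.0}  # placeholder for the real arguments dict
lower = arguments["lower"]

# Guard against None so the UDF cannot raise inside the executors.
lower_udf = udf(lambda x: lower if x is not None and x < lower else x, DoubleType())

df = spark.createDataFrame([(-1.5,), (2.5,), (None,)], ["value"])
df.withColumn("clipped", lower_udf("value")).show()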

sql.functions partitionBy AttributeError: 'NoneType' object has no attribute '_jvm'

2016-10-21 Thread muhammet pakyürek
I work with partitionBy for the lead/lag functions, I get the error above, and here is the line it points to:


jspec = sc._jvm.org.apache.spark.sql.expressions.Window.partitionBy(_to_java_cols(cols))
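
That line fails because sc is None at that point: Window.partitionBy reaches into the JVM through the active SparkContext, and none has been created yet. A minimal sketch of the fix under that assumption (the column name is a placeholder):

from pyspark.sql import SparkSession
from pyspark.sql.window import Window

# Create the SparkContext/SparkSession first; Window.partitionBy needs a live JVM gateway.
spark = SparkSession.builder.appName("window-init").getOrCreate()

w = Window.partitionBy("some_column")  # placeholder column name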





pyspark dataframe code for lead/lag of a column

2016-10-20 Thread muhammet pakyürek


Is there PySpark DataFrame code for computing lead/lag of a column?

The lead/lag of a column looks like this:

value  lag  lead
1      -1   2
2      1    3
3      2    4
4      3    5
5      4    -1
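
A minimal sketch that reproduces the table above with pyspark.sql.functions lag/lead over a window; here a single global window ordered by value, whereas real data would normally also be partitioned by a key:

from pyspark.sql import SparkSession
from pyspark.sql.functions import lag, lead
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("lead-lag").getOrCreate()
df = spark.createDataFrame([(v,) for v in [1, 2, 3, 4, 5]], ["value"])

w = Window.orderBy("value")

df.select(
    "value",
    lag("value", 1, -1).over(w).alias("lag"),    # -1 fills the first row, as in the table
    lead("value", 1, -1).over(w).alias("lead"),  # -1 fills the last row
).show()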


how to see spark class variable values in the variable explorer of spyder for python?

2016-10-19 Thread muhammet pakyürek

Is there any way to see Spark class variable values in the variable explorer of Spyder for Python?



tutorial for accessing elements of dataframe columns and column values of specific rows?

2016-10-18 Thread muhammet pakyürek
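
No tutorial is linked in the thread. A minimal sketch of the usual access patterns, with made-up data and column names:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("access-example").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["id", "letter"])

# All values of one column, collected back to the driver as a Python list.
letters = [row["letter"] for row in df.select("letter").collect()]

# Column values of a specific row, selected by a predicate rather than by position.
row = df.filter(df["id"] == 2).first()
print(letters, row["letter"])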







rdd and dataframe columns dtype

2016-10-17 Thread muhammet pakyürek
How can I set the column dtypes of an RDD?
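
An RDD carries no column dtypes of its own; the types live on a DataFrame. A minimal sketch under the assumption that the goal is to control the DataFrame types, by converting the RDD and casting (names and data are placeholders):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("dtype-example").getOrCreate()

rdd = spark.sparkContext.parallelize([(1, "2.5"), (2, "3.0")])
df = rdd.toDF(["id", "value"]).withColumn("value", col("value").cast("double"))
print(df.dtypes)  # [('id', 'bigint'), ('value', 'double')]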




how to decide which parts of a process should use spark dataframes and which pandas dataframes?

2016-09-26 Thread muhammet pakyürek


Is there a clear guide for deciding the above?
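
There is no recorded answer; a common rule of thumb is to keep the large, distributed scan and aggregation work in Spark and hand only small results to pandas. A sketch of that hand-off, assuming the aggregate fits in driver memory:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-vs-pandas").getOrCreate()

# Heavy, distributed part in Spark: scan and aggregate the full dataset.
big = spark.range(1000000).withColumn("bucket", F.col("id") % 10)
summary = big.groupBy("bucket").count()

# Small, driver-side part in pandas: the aggregate easily fits in memory.
pdf = summary.toPandas()
print(pdf.head())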


how to find NaN values in each row of a spark dataframe to decide whether the row is dropped or not

2016-09-26 Thread muhammet pakyürek

Is there any way to do this directly? If not, is there any way to do it indirectly using other Spark data structures?
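
A sketch of one indirect way, assuming numeric columns: count NaN/null cells per row and filter on that count (the threshold and data are placeholders). DataFrame.na.drop already covers the simpler case of dropping rows with any or all nulls.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("nan-per-row").getOrCreate()
df = spark.createDataFrame(
    [(1.0, float("nan")), (float("nan"), float("nan")), (3.0, 4.0)],
    ["a", "b"],
)

# Per-row count of NaN/null cells across the columns.
nan_flags = [F.when(F.isnan(c) | F.col(c).isNull(), 1).otherwise(0) for c in df.columns]
counted = df.withColumn("nan_count", sum(nan_flags))

# Keep rows with at most one bad cell (the threshold is arbitrary here).
counted.filter(F.col("nan_count") <= 1).show()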



ERROR StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.

2016-09-23 Thread muhammet pakyürek

I tried to connect to Cassandra via spark-cassandra-connector 2.0.0 on PySpark, but I get the error below.

I think it is related to pyspark/context.py, but I don't know how.
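
The "All masters are unresponsive" message usually means spark.master points at an address where no standalone master is running; it is independent of the Cassandra side. A sketch of ruling that out by running against the local scheduler first (the host value is an assumption):

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")  # or "spark://<master-host>:7077" once the standalone master is confirmed up
    .config("spark.cassandra.connection.host", "localhost")
    .getOrCreate()
)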


unresolved dependency: datastax#spark-cassandra-connector;2.0.0-s_2.11-M3-20-g75719df: not found

2016-09-21 Thread muhammet pakyürek
When I run spark-shell as below:

spark-shell \
  --jars '/home/ktuser/spark-cassandra-connector/target/scala-2.11/root_2.11-2.0.0-M3-20-g75719df.jar' \
  --packages datastax:spark-cassandra-connector:2.0.0-s_2.11-M3-20-g75719df \
  --conf spark.cassandra.connection.host=localhost

I get the error:

unresolved dependency: datastax#spark-cassandra-connector;2.0.0-s_2.11-M3-20-g75719df


The second question: even after I added

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.0-M3"

to the spark-cassandra-connector/sbt/sbt file, the built jar files are root_2.11-2.0.0-M3-20-g75719df.


The third question: after building the connector for Scala 2.11, how do I integrate it with PySpark?
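
For the third question, a sketch that is not verified against this exact build: hand the built jar to PySpark at session start (the jar path is the one from the post, the Cassandra host is assumed local), after which tables are read through the DataFrame reader with format org.apache.spark.sql.cassandra.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("pyspark-cassandra")
    .config("spark.jars",
            "/home/ktuser/spark-cassandra-connector/target/scala-2.11/root_2.11-2.0.0-M3-20-g75719df.jar")
    .config("spark.cassandra.connection.host", "localhost")
    .getOrCreate()
)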



can cassandra and spark be built and run on the same computer?

2016-09-20 Thread muhammet pakyürek


Can we connect to Cassandra from Spark using spark-cassandra-connector when all three are built on the same computer? What kind of problems does this configuration lead to?


is there any bug in the configuration of spark 2.0, cassandra spark connector 2.0 and cassandra 3.0.8?

2016-09-19 Thread muhammet pakyürek


Please tell me the configuration, including the most recent versions of Cassandra, Spark and the Cassandra Spark connector.


cassandra.yaml configuration for cassandra spark connection

2016-09-19 Thread muhammet pakyürek
How do I configure the cassandra.yaml configuration file for the DataStax Cassandra Spark connection?




best versions for cassandra spark connection

2016-09-19 Thread muhammet pakyürek
hi


In order to connect PySpark to Cassandra, which versions of the components must be installed? I think Cassandra 3.7 is not compatible with Spark 2.0 and the DataStax PySpark-Cassandra connector 2.0. Please give me the correct versions and the steps to connect them.




is cassandra 3.7 compatible with the datastax Spark Cassandra Connector 2.0?

2016-09-19 Thread muhammet pakyürek





correct conf for SparkConf().set().setMaster() to connect to cassandra

2016-09-19 Thread muhammet pakyürek
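
No answer is recorded in the thread; a minimal sketch of a SparkConf for the connector, where the master URL and Cassandra address are placeholders for the local setup:

from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (
    SparkConf()
    .setMaster("spark://master-host:7077")  # assumed master URL; use "local[*]" for a single-machine test
    .setAppName("cassandra-connect-test")
    .set("spark.cassandra.connection.host", "127.0.0.1")  # the address Cassandra listens on
)

spark = SparkSession.builder.config(conf=conf).getOrCreate()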





cassandra cannot be accessed via pyspark or spark-shell, but it is accessible using cqlsh. what is the problem?

2016-09-19 Thread muhammet pakyürek


I have tried all the examples I could find on the internet to access a Cassandra table via PySpark or spark-shell. However, all of the attempts failed with errors related to the Java gateway. What is the main problem?
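
A minimal sketch of the DataFrame-reader path, assuming PySpark was started with the connector on the classpath and spark.cassandra.connection.host set; the keyspace and table names are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # assumes the connector jar/package was supplied at launch

df = (
    spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(keyspace="my_keyspace", table="my_table")  # placeholder names
    .load()
)
df.show(5)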


Failed to open native connection to Cassandra at

2016-09-07 Thread muhammet pakyürek
How do I solve the problem below?

py4j.protocol.Py4JJavaError: An error occurred while calling o33.load.
: java.io.IOException: Failed to open native connection to Cassandra at {127.0.1.1}:9042
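
A hedged sketch of one common cause: the connector resolved the driver host name, which on Debian/Ubuntu often maps to 127.0.1.1 in /etc/hosts, while Cassandra listens on its rpc_address (typically 127.0.0.1). Pointing the connector at that address explicitly is one way to check:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.cassandra.connection.host", "127.0.0.1")  # match Cassandra's rpc_address
    .getOrCreate()
)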




clear steps for installation of spark, cassandra and cassandra connector to run on spyder 2.3.7 using python 3.5 and anaconda 2.4 ipython 4.0

2016-09-06 Thread muhammet pakyürek


Could you send me documents and links covering all of the above requirements for installing Spark, Cassandra and the Cassandra connector to run on Spyder 2.3.7 using Python 3.5, Anaconda 2.4 and IPython 4.0?
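
No step list is recorded in the thread. One hedged sketch for making an installed Spark importable from a plain interpreter such as the one Spyder runs, using the third-party findspark helper (the SPARK_HOME path is a placeholder):

import os
import findspark  # third-party helper: pip/conda install findspark

# Point findspark at the Spark installation so pyspark becomes importable
# inside Spyder's interpreter; the path below is a placeholder.
os.environ["SPARK_HOME"] = "/opt/spark-2.0.1-bin-hadoop2.7"
findspark.init()

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spyder-test").getOrCreate()
print(spark.version)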