Re:

2016-11-28 Thread Marco Mistroni
Uhm, this link https://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations seems to indicate you can do it. hth On Mon, Nov 28, 2016 at 9:55 PM, Didac Gil wrote: > Any suggestions for using something like OneHotEncoder and
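A minimal Scala sketch of the pattern that guide section describes, assuming a DStream[String] named `words`:

    import org.apache.spark.sql.SparkSession

    words.foreachRDD { rdd =>
      // get the singleton SparkSession with the RDD's configuration
      val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
      import spark.implicits._

      val wordsDF = rdd.toDF("word")
      wordsDF.createOrReplaceTempView("words")
      spark.sql("select word, count(*) as total from words group by word").show()
    }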

Re: spark-shell not starting ( in a Kali linux 2 OS)

2016-11-13 Thread Marco Mistroni
Hi, not a Linux expert, but how did you install Spark? As a root user? The error above seems to indicate you don't have permissions to access that directory. If you have full control of the host you can try to do a chmod 777 on the directory where you installed Spark and its subdirs

sbt shenanigans for a Spark-based project

2016-11-13 Thread Marco Mistroni
Hi all, i have a small Spark-based project which at the moment depends on jars from Spark 1.6.0. The project has a few Spark examples plus one which depends on Flume libraries. I am attempting to move to Spark 2.0, but i am having issues with my dependencies. The setup below works fine when compiled
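For reference, a minimal build.sbt along these lines might look like the sketch below (project name and versions are illustrative; artifact names are from Maven Central):

    name := "SparkExamples"
    version := "1.0"
    scalaVersion := "2.11.8"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"            % "2.0.0" % "provided",
      "org.apache.spark" %% "spark-sql"             % "2.0.0" % "provided",
      "org.apache.spark" %% "spark-mllib"           % "2.0.0" % "provided",
      "org.apache.spark" %% "spark-streaming"       % "2.0.0" % "provided",
      "org.apache.spark" %% "spark-streaming-flume" % "2.0.0"
    )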

Re: Usage of mllib api in ml

2016-11-20 Thread Marco Mistroni
to leverage RDD version using ml dataframes? > > *mllib*: MulticlassMetrics > *ml*: MulticlassClassificationEvaluator > > On Sun, Nov 20, 2016 at 4:52 AM, Marco Mistroni <mmistr...@gmail.com> > wrote: > >> Hi >> you can also have a look at this example,
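A minimal sketch of the DataFrame-based equivalent being discussed, assuming a `predictions` DataFrame with "label" and "prediction" columns:

    import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

    // ml counterpart of mllib's MulticlassMetrics
    val evaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
    val accuracy = evaluator.evaluate(predictions)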

RE: Error in running twitter streaming job

2016-11-20 Thread Marco Mistroni
e and giving me good results. But in Spark master, it is > taking so much time and went on hold. Please help. > > > > Thanks, > > SIvaram > > > > *From:* Marco Mistroni [mailto:mmistr...@gmail.com] > *Sent:* Sunday, November 20, 2016 6:34 PM > *To:* Kappaganthu, Sivaram (E

Re: sbt shenanigans for a Spark-based project

2016-11-15 Thread Marco Mistroni
.sbt and sbt > v.0.13.12. > > -Don > > On Mon, Nov 14, 2016 at 3:11 PM, Marco Mistroni <mmistr...@gmail.com> > wrote: > >> uhm. sorry.. still same issues. this is the new version >> >> name := "SparkExamples" >> version := "1.0"

Re: Usage of mllib api in ml

2016-11-20 Thread Marco Mistroni
Hi you can also have a look at this example, https://github.com/sryza/aas/blob/master/ch04-rdf/src/main/scala/com/cloudera/datascience/rdf/RunRDF.scala#L220 kr marco On Sun, Nov 20, 2016 at 9:09 AM, Yanbo Liang wrote: > You can refer this

Re: sbt shenanigans for a Spark-based project

2016-11-14 Thread Marco Mistroni
he.spark.ml [error] import org.apache.spark.ml.tuning.{ ParamGridBuilder, TrainValidationSplit } [error]^ [error] C:\Users\marco\SparkExamples\src\main\scala\DecisionTreeExampleML.scala:16: object Pipeline is not a member of package org.apache.spark.ml [error] import org.apa

Re: Run spark-shell inside Docker container against remote YARN cluster

2016-10-27 Thread Marco Mistroni
I am running Spark inside Docker, though not connecting to a cluster. How did u build Spark? Which profile did u use? Pls share details and I can try to replicate. Kr On 27 Oct 2016 2:30 pm, "ponkin" wrote: Hi, May be someone already had experience to build docker image for

Re: Random Forest hangs without trace of error

2016-12-10 Thread Marco Mistroni
Hi Bring back samples to the 1k range to debug... or as suggested reduce trees and bins. Had an rdd running on same size data with no issues... or send me some sample code and data and I'll try it out on my ec2 instance ... Kr On 10 Dec 2016 3:16 am, "Md. Rezaul Karim"
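A hedged starting point for that kind of debugging, with illustrative parameter values and an assumed `trainingData` DataFrame:

    import org.apache.spark.ml.classification.RandomForestClassifier

    // fewer/shallower trees while debugging; very deep trees are a
    // common cause of hangs and OOMs
    val rf = new RandomForestClassifier()
      .setNumTrees(10)
      .setMaxDepth(5)
      .setMaxBins(32)
    val model = rf.fit(trainingData.limit(1000))  // 1k-row sample first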

Re: Random Forest hangs without trace of error

2016-12-10 Thread Marco Mistroni
maxBins is 32. > > I will probably need to leave this for a few weeks to focus on more > short-term stuff, but I will write here if I solve it or reproduce it more > consistently. > > Morten > > On 10 Dec 2016 at 17.29, Marco Mistroni <mmistr...@gmail.com> wrote: > > H

Re: unit testing in spark

2016-12-09 Thread Marco Mistroni
Me too, as I spent most of my time writing unit/integ tests. Pls advise on where I can start. Kr On 9 Dec 2016 12:15 am, "Miguel Morales" wrote: > I would be interested in contributing. I've created my own library for > this as well. In my blog post I talk about

Re: Random Forest hangs without trace of error

2016-12-11 Thread Marco Mistroni
> I hope to be able to provide a good repro case in some weeks. If the > problem was in our own code I will also post it in this thread. > > Morten > > On 10 Dec 2016 at 23.25, Marco Mistroni <mmistr...@gmail.com> wrote: > > Hello Morten > ok. > afaik there is a ti

Re: [Spark Core] - Spark dynamoDB integration

2016-12-12 Thread Marco Mistroni
Hi If it can help: 1. Check the Java docs for when that method was introduced 2. U building a fat jar? Check which libraries have been included... some other dependencies might have forced an old copy to be included 3. If u take code outside spark... does it work successfully? 4. Send short

Running Spark on EMR

2017-01-15 Thread Marco Mistroni
hi all, could anyone assist here? i am trying to run spark 2.0.0 on an EMR cluster, but i am having issues connecting to the master node. So, below is a snippet of what i am doing: sc = SparkSession.builder.master(sparkHost).appName("DataProcess").getOrCreate() sparkHost is passed as input
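The snippet above is PySpark; the usual fix, shown here as a Scala sketch, is to point the master at YARN rather than at a hostname:port, since EMR runs Spark on YARN:

    import org.apache.spark.sql.SparkSession

    // on EMR the cluster manager is YARN, so "yarn" is the master
    val spark = SparkSession.builder()
      .master("yarn")
      .appName("DataProcess")
      .getOrCreate()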

Re: Debugging a PythonException with no details

2017-01-14 Thread Marco Mistroni
It seems it has to do with UDF..Could u share snippet of code you are running? Kr On 14 Jan 2017 1:40 am, "Nicholas Chammas" wrote: > I’m looking for tips on how to debug a PythonException that’s very sparse > on details. The full exception is below, but the only

Re: backward compatibility

2017-01-10 Thread Marco Mistroni
I think old APIs are still supported but u r advised to migrate. I migrated a few apps from 1.6 to 2.0 with minimal changes. Hth On 10 Jan 2017 4:14 pm, "pradeepbill" wrote: > hi there, I am using spark 1.4 code and now we plan to move to spark 2.0, > and > when I check

Re: java.lang.Exception: Could not compute split, block input-0-1480539568000 not found

2016-11-30 Thread Marco Mistroni
Could you paste reproducible snippet code? Kr On 30 Nov 2016 9:08 pm, "kant kodali" wrote: > I have lot of these exceptions happening > > java.lang.Exception: Could not compute split, block input-0-1480539568000 > not found > > > Any ideas what this could be? >

Re: How to convert a unix timestamp column into date format(yyyy-MM-dd) ?

2016-12-04 Thread Marco Mistroni
Hi In Python you can use datetime.fromtimestamp(..).strftime('%Y%m%d'). Which spark API are you using? Kr On 5 Dec 2016 7:38 am, "Devi P.V" wrote: > Hi all, > > I have a dataframe like the following, >
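At the DataFrame level there is also a built-in function for this; a sketch assuming a DataFrame `df` with a column `ts` holding epoch seconds:

    import org.apache.spark.sql.functions.from_unixtime

    // format the unix timestamp column directly as yyyy-MM-dd
    val withDate = df.withColumn("date", from_unixtime(df("ts"), "yyyy-MM-dd"))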

Re: java.lang.Exception: Could not compute split, block input-0-1480539568000 not found

2016-12-01 Thread Marco Mistroni
>>>> JsonObject jsonObj = parser.parse(s).getAsJsonObject(); >>>>> if (jsonObj != null && jsonObj.has("var1") ) { >>>>> JsonObject transactionObject = >>>>> jsonObj.get("var1").getAsJsonObject();

Re: Spark Python in Jupyter Notebook

2017-01-05 Thread Marco Mistroni
Hi might be off topic, but databricks has a web application in which you can use spark with jupyter. have a look at https://community.cloud.databricks.com kr On Thu, Jan 5, 2017 at 7:53 PM, Jon G wrote: > I don't use MapR but I use pyspark with jupyter, and this MapR

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-30 Thread Marco Mistroni
>> error comes in some specific scenario as per my observations: >> >> 1. When two parallel spark separate application is initiated from one >> driver (not all the time, sometime) >> 2. If one spark jobs are running for more than expected hour let say 2-3 >>

Re: Re: Re: Spark Streaming prediction

2017-01-03 Thread Marco Mistroni
hours (one value > per minute) should be predicted. > > Thank you in advance. > > Regards, > Daniela > > *Sent:* Monday, 02 January 2017 at 22:30 > *From:* "Marco Mistroni" <mmistr...@gmail.com> > *To:* "Daniela S" <daniela_4...@gmx

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-30 Thread Marco Mistroni
6 ) > Palash>> I didn't test with Spark 1.6. My app is running now good as I > stopped second app (delayed data loading) since last two days. Even most of > the case both are running well except few times... > > > Sent from Yahoo Mail on Android > <https://overview.m

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2017-01-05 Thread Marco Mistroni
al/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", > line 933, in __call__ > File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", > line 63, in deco > File "/usr/local/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", > line 3

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-29 Thread Marco Mistroni
Hi Pls try to read a CSV from filesystem instead of hadoop. If you can read it successfully then your hadoop file is the issue and you can start debugging from there. Hth On 29 Dec 2016 6:26 am, "Palash Gupta" wrote: > Hi Apache Spark User team, > > > >

Re: [ML] Converting ml.DenseVector to mllib.Vector

2016-12-31 Thread Marco Mistroni
Hi. You have a DataFrame.. there should be either a way to - convert a DF to a Vector without doing a cast - use an ML library which relies on DataFrames only. I can see that your code is still importing libraries from two different 'machine learning' packages import
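For the conversion itself, Spark 2.x ships explicit converters between the two vector types; a minimal sketch:

    import org.apache.spark.ml.linalg.{Vectors => NewVectors}
    import org.apache.spark.mllib.linalg.{Vectors => OldVectors}

    val mlVec = NewVectors.dense(1.0, 2.0, 3.0)  // an ml.DenseVector
    val mllibVec = OldVectors.fromML(mlVec)      // mllib.Vector (fromML was added in 2.0)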

Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"

2016-12-29 Thread Marco Mistroni
possible > reasons why failed to broadcast error may come. > > Or if you need more logs I can share. > > Thanks again Spark User Group. > > Best Regards > Palash Gupta > > > > Sent from Yahoo Mail on Android > <https://overview.mail.yahoo.com/mobile/?.src=Android

Re: Spark Streaming prediction

2017-01-02 Thread Marco Mistroni
Hi you might want to have a look at the Regression ML algorithms and integrate one in your SparkStreaming application; i'm sure someone on the list has a similar use case. Shortly, you'd want to process all your events and feed them through a ML model which, based on your inputs, will predict output

Re: Re: Spark Streaming prediction

2017-01-02 Thread Marco Mistroni
> time on the dashboard (e.g. how does the dashboard know that the value for > minute 300 maps to time 15:05)? > > Thank you in advance. > > Best regards, > Daniela > > > > *Sent:* Monday, 02 January 2017 at 21:07 > *From:* "Marco Mistroni" <

Re: Error when loading json to spark

2017-01-01 Thread Marco Mistroni
Hi you will need to pass the schema, like in the snippet below (even though the code might have been superseded in spark 2.0) import sqlContext.implicits._ val jsonRdd = sc.textFile("file:///c:/tmp/1973-01-11.json") val schema = (new StructType).add("hour",
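A completed version of that idea against the Spark 2.x reader, assuming a SparkSession named `spark` (field names beyond "hour" are illustrative):

    import org.apache.spark.sql.types._

    val schema = new StructType()
      .add("hour", StringType)
      .add("temperature", DoubleType)
    val df = spark.read.schema(schema).json("file:///c:/tmp/1973-01-11.json")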

Re: Spark ML's RandomForestClassifier OOM

2017-01-10 Thread Marco Mistroni
You running locally? Found exactly the same issue. 2 solutions: - reduce data size - run on EMR Hth On 10 Jan 2017 10:07 am, "Julio Antonio Soto" wrote: > Hi, > > I am running into OOM problems while training a Spark ML > RandomForestClassifier (maxDepth of 30, 32 maxBins, 100

Re: Upgrade the scala code using the most updated Spark version

2017-03-28 Thread Marco Mistroni
1.7.5 On 28 Mar 2017 10:10 pm, "Anahita Talebi" <anahita.t.am...@gmail.com> wrote: > Hi, > > Thanks for your answer. > What is the version of "org.slf4j" % "slf4j-api" in your sbt file? > I think the problem might come from this part. > >

Re: Upgrade the scala code using the most updated Spark version

2017-03-28 Thread Marco Mistroni
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) => > { > case PathList("javax", "servlet", xs @ _*) => > MergeStrategy.first > case PathList(ps @ _*) if ps.last endsWith ".html" => > MergeStrategy.first > case "application.conf"

Re: Upgrade the scala code using the most updated Spark version

2017-03-28 Thread Marco Mistroni
Hello, that looks to me like there's something dodgy with your Scala installation. Though Spark 2.0 is built on Scala 2.11, it still supports 2.10... i suggest you change one thing at a time in your sbt. First the Spark version: run it and see if it works. Then amend the scala version. hth marco On Tue,

Re:

2017-03-09 Thread Marco Mistroni
Try to remove the Kafka code, as it seems Kafka is not the issue here. Create a DS and save to Cassandra and see what happens... even in the console. That should give u a starting point? Hth On 9 Mar 2017 3:07 am, "sathyanarayanan mudhaliyar" < sathyanarayananmudhali...@gmail.com> wrote:
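A minimal sketch of isolating the Cassandra write, assuming the spark-cassandra-connector package is on the classpath and using placeholder keyspace/table names:

    // save an existing Dataset `ds` straight to Cassandra, no Kafka involved
    ds.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_ks", "table" -> "my_table"))
      .mode("append")
      .save()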

Re: SPARK Issue in Standalone cluster

2017-08-03 Thread Marco Mistroni
Hello, my 2 cents here, hope it helps. If you just want to play around with Spark, i'd leave Hadoop out; it's an unnecessary dependency that you don't need for just running a python script. Instead do the following: - go to the root of your master / slave node. create a directory /root/pyscripts -

Re: Spark Testing Library Discussion

2017-04-26 Thread Marco Mistroni
Uh, i stayed online in the other link but nobody joined... Will follow the transcript. Kr On 26 Apr 2017 9:35 am, "Holden Karau" wrote: > And the recording of our discussion is at https://www.youtube.com/ > watch?v=2q0uAldCQ8M > A few of us have follow up things and we will try

Re: SPARK Issue in Standalone cluster

2017-08-05 Thread Marco Mistroni
> > --- > > > if you execute my code then also you will surprisingly see that the writes > in the nodes which is not the master node does not complete moving the > files from the _temporary fol

Re: SPARK Issue in Standalone cluster

2017-08-06 Thread Marco Mistroni
ns, sqlContext) logger.info('Out of here..') ## On Sat, Aug 5, 2017 at 9:09 PM, Marco Mistroni <mmistr...@gmail.com> wrote: > Uh believe me there are lots of ppl on this list who will send u code > snippets if u ask...  > > Yes that is what Steve po

Re: problem initiating spark context with pyspark

2017-06-08 Thread Marco Mistroni
try this link http://letstalkspark.blogspot.co.uk/2016/02/getting-started-with-spark-on-window-64.html it helped me when i had similar problems with windows... hth On Wed, Jun 7, 2017 at 3:46 PM, Curtis Burkhalter < curtisburkhal...@gmail.com> wrote: > Thanks Doc I saw this on another

Re: problem initiating spark context with pyspark

2017-06-10 Thread Marco Mistroni
On Thu, Jun 8, 2017 at 8:38 PM, Marco Mistroni <mmistr...@gmail.com> > wrote: > >> try this link >> >> http://letstalkspark.blogspot.co.uk/2016/02/getting-started- >> with-spark-on-window-64.html >> >> it helped me when i had similar problems with

Re: PLs assist: trying to FlatMap a DataSet / partially OT

2017-09-16 Thread Marco Mistroni
> > This is what you want to do? > > On Fri, Sep 15, 2017 at 4:21 AM, Marco Mistroni <mmistr...@gmail.com> > wrote: > >> HI all >> could anyone assist pls? >> i am trying to flatMap a DataSet[(String, String)] and i am getting >> errors in Eclipse >

PLs assist: trying to FlatMap a DataSet / partially OT

2017-09-14 Thread Marco Mistroni
Hi all, could anyone assist pls? i am trying to flatMap a DataSet[(String, String)] and i am getting errors in Eclipse. The errors are more Scala-related than Spark-related, but i was wondering if someone came across a similar situation. Here's what i got: a DS of (String, String), out of which i
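A self-contained sketch of one way to flatMap such a Dataset (the sample data and the split logic are illustrative):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").appName("FlatMapDS").getOrCreate()
    import spark.implicits._  // brings in the encoders the compiler asks for

    val ds = Seq(("k1", "a,b"), ("k2", "c")).toDS()  // Dataset[(String, String)]
    val flattened = ds.flatMap { case (key, csv) => csv.split(",").map(v => (key, v)) }
    flattened.show()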

RE: Spark 2.2.0 Win 7 64 bits Exception while deleting Spark temp dir

2017-10-04 Thread Marco Mistroni
Hi Got similar issues on win 10. It has to do imho with the way permissions are set up in windows. That should not prevent ur program from getting back a result.. Kr On Oct 3, 2017 9:42 PM, "JG Perrin" wrote: > do you have a little more to share with us? > > > > maybe

Re: Database insert happening two times

2017-10-17 Thread Marco Mistroni
Hi Uh, if the problem is really with parallel exec u can try to call repartition(1) before u save. Alternatively try to store the data in a csv file and see if u have the same behaviour, to exclude dynamodb issues. Also.. are the multiple rows being written dupes (do they all have the same fields/values)? Hth On
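The first suggestion, as a sketch over an assumed DataFrame `df`:

    // a single partition means a single writing task, which rules out
    // parallel tasks inserting the same rows twice
    df.repartition(1).write.csv("/tmp/debug-out")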

Re: [Meetup] Apache Spark and Ignite for IoT scenarious

2017-09-07 Thread Marco Mistroni
Hi Will there be a podcast to view afterwards for remote EMEA users? Kr On Sep 7, 2017 12:15 AM, "Denis Magda" wrote: > Folks, > > Those who are craving for mind food this weekend come over the meetup - > Santa Clara, Sept 9, 9.30 AM: >

Re: NullPointerException error while saving Scala Dataframe to HBase

2017-10-01 Thread Marco Mistroni
Hi The question is getting to the list. I have no experience in hbase... though, having seen similar stuff when saving a df somewhere else... it might have to do with the properties you need to set to let spark know it is dealing with hbase? Don't u need to set some properties on the spark

Re: Quick one... AWS SDK version?

2017-10-07 Thread Marco Mistroni
Hi JG, out of curiosity what's ur usecase? Are you writing to S3? You could use Spark to do that, e.g. using the hadoop package org.apache.hadoop:hadoop-aws:2.7.1.. that will download the aws client which is in line with hadoop 2.7.1. hth marco On Fri, Oct 6, 2017 at 10:58 PM, Jonathan Kelly
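For instance, after launching with --packages org.apache.hadoop:hadoop-aws:2.7.1, a hedged sketch of writing through the s3a filesystem (bucket name is a placeholder; credentials assumed to be in the environment):

    val hc = spark.sparkContext.hadoopConfiguration
    hc.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
    hc.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
    df.write.csv("s3a://my-bucket/output/")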

Re: Please Help with DecisionTree/FeatureIndexer

2017-12-16 Thread Marco Mistroni
dexer = new VectorIndexer() > .setInputCol("features") <-- Here specify the "features" column to > index. > .setOutputCol("indexedFeatures") > > > Thanks. > > > On Sat, Dec 16, 2017 at 6:26 AM, Marco Mistroni <mmistr...@gmail.co

Please Help with DecisionTree/FeatureIndexer

2017-12-15 Thread Marco Mistroni
Hi all, i am trying to run a sample decision tree, following the examples here (for MLlib) https://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree-classifier The example seems to use a VectorIndexer; however i am missing something. How does the featureIndexer know
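Per the reply above, setMaxCategories is how the indexer "knows"; a sketch of the documented pattern, with `data` as the assumed input DataFrame carrying a "features" vector column:

    import org.apache.spark.ml.feature.VectorIndexer

    // any feature with <= 4 distinct values is treated as categorical,
    // everything else as continuous
    val featureIndexer = new VectorIndexer()
      .setInputCol("features")
      .setOutputCol("indexedFeatures")
      .setMaxCategories(4)
      .fit(data)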

How to control logging in testing package com.holdenkarau.spark.testing.

2017-12-13 Thread Marco Mistroni
Hi all, could anyone advise on how to control logging in com.holdenkarau.spark.testing? There are loads of spark logging statements every time i run a test. I tried to disable spark logging using the statements below, but with no success import org.apache.log4j.Logger import
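One approach that often works is to silence Spark's own loggers before the suite spins up a context (logger names can vary by Spark version); another is a log4j.properties file on the test classpath:

    import org.apache.log4j.{Level, Logger}

    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
    Logger.getLogger("org.spark_project").setLevel(Level.WARN)
    Logger.getLogger("org.eclipse.jetty").setLevel(Level.OFF)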

Re: pyspark configuration with Juyter

2017-11-04 Thread Marco Mistroni
Hi probably not what u r looking for, but if u get stuck with conda, jupyter and spark: if u get an account @ community.cloudera you will enjoy jupyter and spark out of the box. Gd luck and hth Kr On Nov 4, 2017 4:59 PM, "makoto" wrote: > I setup environment variables in

Re: PySpark 2.1 Not instantiating properly

2017-10-20 Thread Marco Mistroni
>> https://stackoverflow.com/questions/34196302/the-root-scratch-dir-tmp-hive-on-hdfs-should-be-writable-current-permissions >> >> On 21 October 2017 at 03:16, Marco Mistroni <mmistr...@gmail.com> wrote: >> >>> Did u build spar

Re: RV: Unintelligible warning arose out of the blue.

2018-05-04 Thread Marco Mistroni
Hi i think it has to do with spark configuration; i dont think the standard configuration is geared up to be running in local mode on windows. Your dataframe is ok; you can check that you have read it successfully by printing out df.count() and you will see your code is reading the dataframe

Re: Error submitting Spark Job in yarn-cluster mode on EMR

2018-05-08 Thread Marco Mistroni
Did you by any chance leave a sparkSession.setMaster("local") lurking in your code? Last time i checked, to run on yarn you have to package a 'fat jar'. Could you make sure the spark dependencies in your jar match the version you are running on Yarn? Alternatively please share code including
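A sketch of the first point: leave the master unset in code destined for YARN, so that spark-submit --master yarn can decide (the app name is a placeholder):

    import org.apache.spark.sql.SparkSession

    // a hardcoded .master("local") would silently override spark-submit
    // and the job would never reach the cluster
    val spark = SparkSession.builder()
      .appName("MyYarnJob")
      .getOrCreate()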

Re: Dataframe vs dataset

2018-04-28 Thread Marco Mistroni
Imho.. neither. I see datasets as typed df and therefore ds are enhanced df. Feel free to disagree.. Kr On Sat, Apr 28, 2018, 2:24 PM Michael Artz wrote: > Hi, > > I use Spark everyday and I have a good grip on the basics of Spark, so > this question isnt for myself. But

Re: Problem in persisting file in S3 using Spark: xxx file does not exist Exception

2018-05-02 Thread Marco Mistroni
> messages if you don't have the correct permissions. > > On Tue, Apr 24, 2018, 2:28 PM Marco Mistroni <mmistr...@gmail.com> wrote: > >> HI all >> i am using the following code for persisting data into S3 (aws keys are >> already stored in the environment variab

Re: A naive ML question

2018-04-29 Thread Marco Mistroni
Maybe not necessarily what you want, but you could, based on trans attributes, find out the initial state and end state and give it to a decision tree to figure out if, based on these attributes, you can predict the final stage. Again, not what you asked, but an idea to use ml for your data? Kr On Sun,

Re: Write to HDFS

2017-10-20 Thread Marco Mistroni
Hi Could you just create an rdd/df out of what you want to save and store it in hdfs? Hth On Oct 20, 2017 9:44 AM, "Uğur Sopaoğlu" wrote: > Hi all, > > In word count example, > > val textFile = sc.textFile("Sample.txt") > val counts = textFile.flatMap(line =>

Re: Write to HDFS

2017-10-20 Thread Marco Mistroni
.map(word => (word, 1)) .reduceByKey(_ + _) It saves the results into more than one partition, like part-0, part-1. I want to collect all of them into one file. 2017-10-20 16:43 GMT+03:00 Marco Mistroni <mmistr...@gmail.com>: > Hi > Could
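One way to get a single part file, sketched over the `counts` RDD from the quoted word count (fine for small results, since it serializes the write through one task):

    // one partition -> one output file under the target directory
    counts.coalesce(1).saveAsTextFile("hdfs:///user/output/wordcount")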

Re: PySpark 2.1 Not instantiating properly

2017-10-20 Thread Marco Mistroni
Did u build spark or download the zip? I remember having a similar issue... either you have to give write perm to your /tmp directory or there's a spark config you need to override. This error is not 2.1-specific... let me get home and check my configs. I think I amended my /tmp permissions via

Re: Best active groups, forums or contacts for Spark ?

2018-01-26 Thread Marco Mistroni
Hi From personal experience... and I might be asking u obvious questions: 1. Does it work in standalone (no cluster)? 2. Can u break down the app in pieces and try to see at which step the code gets killed? 3. Have u had a look at the spark gui to see if the executors go oom? I might be oversimplifying what

Re: good materiala to learn apache spark

2018-01-18 Thread Marco Mistroni
Jacek Laskowski on this mailing list wrote a book which is available online. Hth On Jan 18, 2018 6:16 AM, "Manuel Sopena Ballesteros" < manuel...@garvan.org.au> wrote: > Dear Spark community, > > > > I would like to learn more about apache spark. I have a Horton works HDP > platform and have

Re: How to debug Spark job

2018-09-08 Thread Marco Mistroni
Hi Might sound like dumb advice, but try to break apart your process. Sounds like you are doing ETL. Start basic with just E and T, and make the changes that result in issues. If no problem, add the load step. Enable spark logging so that you can post the error message to the list. I think you can have a look

Reading multiple files in Spark / which pattern to use

2018-07-12 Thread Marco Mistroni
hi all, i have multiple files stored in S3 in the following pattern: -MM-DD-securities.txt. I want to read multiple files at the same time.. I am attempting to use this pattern, for example 2016-01*securities.txt,2016-02*securities.txt,2016-03*securities.txt But it does not seem to work
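Hadoop-style globs may be a tighter fit here, since sc.textFile accepts character classes and alternation in paths (bucket and file names below are illustrative):

    // three months of files in one pattern
    val rdd = sc.textFile("s3a://my-bucket/2016-0[1-3]*securities.txt")
    // or, equivalently, with alternation
    val rdd2 = sc.textFile("s3a://my-bucket/2016-{01,02,03}*securities.txt")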

Re: spark-shell gets stuck in ACCEPTED state forever when ran in YARN client mode.

2018-07-08 Thread Marco Mistroni
You running on emr? You checked the emr logs? Was in a similar situation where the job was stuck in accepted and then it died.. turned out to be an issue w. my code when running with huge data. Perhaps try to gradually reduce the load til it works and then start from there? Not a huge help but I

Re: Live Stream Code Reviews :)

2018-04-12 Thread Marco Mistroni
PST I believe... like last time. Works out 9pm BST & 10pm CET if I'm correct. On Thu, Apr 12, 2018, 8:47 PM Matteo Olivi wrote: > Hi, > 11 am in which timezone? > > On Thu 12 Apr 2018, 21:23 Holden Karau wrote: > >> Hi Y'all, >> >> If your

Problem in persisting file in S3 using Spark: xxx file does not exist Exception

2018-04-24 Thread Marco Mistroni
Hi all, i am using the following code for persisting data into S3 (aws keys are already stored in the environment variables): dataFrame.coalesce(1).write.format("com.databricks.spark.csv").save(fileName) However, i keep on receiving an exception that the file does not exist. Here's what comes
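For what it's worth, in Spark 2.x the csv source is built in and the save path should be a directory; a hedged variant of the call above (bucket/path is a placeholder):

    dataFrame.coalesce(1)
      .write
      .format("csv")  // no databricks package needed on Spark 2.x
      .option("header", "true")
      .save("s3a://my-bucket/output/")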

Re: testing frameworks

2019-02-03 Thread Marco Mistroni
Hi, sorry to resurrect this thread. Any spark libraries for testing code in pyspark? The github code above seems related to Scala. Following links in the original threads (and also LMGFY) i found pytest-spark · PyPI. w/kindest regards Marco On Tue,

Re: testing frameworks

2019-02-04 Thread Marco Mistroni
Thanks Hichame, will follow up on that. Anyone on this list using the python version of spark-testing-base? Seems there's support for DataFrame. Thanks in advance and regards Marco On Sun, Feb 3, 2019 at 9:58 PM Hichame El Khalfi wrote: > Hi, > You can use pysparkling =>
