Label values are:
+-----+
|label|
+-----+
|  0.0|
|  1.0|
+-----+
How come it throws *java.lang.IllegalArgumentException*: *"requirement
failed: OneHotEncoderModel expected 2 categorical values for input column
label, but the input column had metadata specifying 3 values."* in
MultilayerPerceptronClassifier?
Using LogisticRegression, RandomForestClassifier or LinearRegression works
fine for the same data and OneHotEncoderEstimator.
Any insight on how to resolve this?
Regards,
Mina
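One possible workaround, sketched below under the assumption that the stale
"3 values" metadata was attached to the label column by an earlier
StringIndexer fit on different data: strip the column's ML metadata so
OneHotEncoderEstimator re-derives the category count from the actual values.
The DataFrame name df is illustrative, not taken from the thread.

// Hedged sketch, not a confirmed fix: drop the stale categorical metadata
// on "label" so the encoder counts the categories itself.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.Metadata;
import static org.apache.spark.sql.functions.col;

Dataset<Row> cleaned = df.withColumn(
    "label", col("label").as("label", Metadata.empty()));
// Re-run the OneHotEncoderEstimator / MultilayerPerceptronClassifier
// pipeline on 'cleaned' instead of 'df'.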
Hi,
I have a question for you.
Do we need to kill a Spark job every time we change it and deploy it to the
cluster? Or is there a way for Spark to automatically pick up the most
recent jar version?
Best regards,
Mina
Thank you very much, really appreciate the information.
Kindest regards,
Mina
On Sat, Sep 29, 2018 at 9:42 PM Peyman Mohajerian wrote:
> Here's a blog on Flint:
> https://databricks.com/blog/2018/09/11/introducing-flint-a-time-series-library-for-apache-spark.html
> I don
- Decomposition (e.g. trend/seasonality)
- AR Model/MA Model/Combined Model (e.g. ARMA, ARIMA)
- ACF (Autocorrelation Function)/PACF (Partial Autocorrelation Function)
- Recurrent Neural Network (LSTM: Long Short Term Memory)
Kindest regards,
Mina
On Wed, Sep 19, 2018 at 12:55 PM Jörn Franke wrote:
> What functional
Hi,
I saw spark-ts <https://github.com/sryza/spark-timeseries>; however, it
looks like it is no longer under active development. I would really
appreciate your insight.
Kindest regards,
Mina
On Wed, Sep 19, 2018 at 12:01 PM Mina Aslani wrote:
> Hi,
> I have a question for you. Do we have any Time-Series Forecasting library
> in Spark?
Hi,
I have a question for you. Do we have any Time-Series Forecasting library
in Spark?
Best regards,
Mina
object has no attribute
'getSeed'/'getTol'/'getMaxIter'.
Your insight is appreciated.
Best regards,
Mina
te for the best model that I can get is the layers.
Any idea?
Best regards,
Mina
Hi,
Is partial fitting/self-training available for a classifier (e.g.
Regression) in Apache Spark?
Best regards,
Mina
was given Vectors with non-matching sizes
I know the cause: the new test data does not have the same vector size as
the data the model was trained on. However, how can I resolve this? What is
the suggested workaround?
I really appreciate your quick response.
Best regards,
Mina
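A common cause and fix, as a hedged sketch: the size mismatch typically
appears when the featurization step is fit separately on training and test
data, so the two vocabularies (and therefore vector sizes) differ. Fitting
once and reusing the same fitted model keeps the sizes aligned.
CountVectorizer and the column/DataFrame names below are illustrative
assumptions, not taken from the thread.

import org.apache.spark.ml.feature.CountVectorizer;
import org.apache.spark.ml.feature.CountVectorizerModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Fit the featurizer once; the vocabulary (and vector size) is fixed here.
CountVectorizerModel cvModel = new CountVectorizer()
    .setInputCol("tokens")
    .setOutputCol("features")
    .fit(trainingDf);

Dataset<Row> trainFeatures = cvModel.transform(trainingDf);
// Reusing the SAME fitted model guarantees matching vector sizes.
Dataset<Row> testFeatures = cvModel.transform(newTestDf);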
OneHotEncoderEstimator - java.lang.NoSuchMethodError:
org.apache.spark.sql.Dataset.withColumns
Regards,
Mina
On Tue, May 15, 2018 at 2:37 AM, Nick Pentreath wrote:
> Multi column support for StringIndexer didn’t make it into Spark 2.3.0
>
> The PR is still in progress I think - should be
not working. Also, OneHotEncoder is deprecated.
I really appreciate your quick response.
Regards,
Mina
Please take a look at the api doc:
https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/ml/feature/StringIndexer.html
On Mon, May 14, 2018 at 4:30 PM, Mina Aslani wrote:
> Hi,
>
> There is no setInputCols/setOutputCols for StringIndexer in Spark Java.
> How can multiple input/output columns be specified then?
Hi,
There is no setInputCols/setOutputCols for StringIndexer in Spark Java.
How can multiple input/output columns be specified then?
Regards,
Mina
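Until multi-column support lands, the usual workaround is one StringIndexer
stage per column, chained in a Pipeline. A sketch with made-up column names:

import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.feature.StringIndexer;

String[] cols = {"colA", "colB", "colC"};  // illustrative column names
PipelineStage[] stages = new PipelineStage[cols.length];
for (int i = 0; i < cols.length; i++) {
  stages[i] = new StringIndexer()
      .setInputCol(cols[i])
      .setOutputCol(cols[i] + "_indexed");
}
Pipeline pipeline = new Pipeline().setStages(stages);
// pipeline.fit(df).transform(df) indexes all three columns in one pass.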
spark.createDataFrame(data, schema);
Any idea? I am using Java; therefore, I cannot convert my RDD with toDF().
Instead, I try to extract the specific fields/values and call
createDataFrame manually.
Regards,
Mina
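For reference, a minimal sketch of building a DataFrame by hand in Java,
assuming a simple two-field schema (the actual fields depend on the data in
question):

import java.util.Arrays;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

StructType schema = DataTypes.createStructType(new StructField[] {
    DataTypes.createStructField("id", DataTypes.IntegerType, false),
    DataTypes.createStructField("label", DataTypes.DoubleType, false)
});
List<Row> rows = Arrays.asList(
    RowFactory.create(1, 0.0),
    RowFactory.create(2, 1.0));
// 'spark' is the application's SparkSession.
Dataset<Row> df = spark.createDataFrame(rows, schema);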
Hi,
I am trying to load a ML model from AWS S3 in my spark app running in a
docker container, however I need to pass the AWS credentials.
My question is: why do I need to pass the credentials in the path?
And what is the workaround?
Best regards,
Mina
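A hedged sketch of the usual alternative: rather than embedding credentials
in the s3a:// URI, set them on the Hadoop configuration (or rely on
environment variables / IAM roles). The bucket and model paths below are
placeholders.

import org.apache.spark.ml.PipelineModel;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
    .appName("model-loader").getOrCreate();
// Standard AWS environment variables, read at startup.
String accessKey = System.getenv("AWS_ACCESS_KEY_ID");
String secretKey = System.getenv("AWS_SECRET_ACCESS_KEY");
spark.sparkContext().hadoopConfiguration().set("fs.s3a.access.key", accessKey);
spark.sparkContext().hadoopConfiguration().set("fs.s3a.secret.key", secretKey);
// The model path then needs no inline credentials:
PipelineModel model = PipelineModel.load("s3a://my-bucket/models/my-model");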
?
Your input is appreciated.
Best regards,
Mina
cannot save df as csv, as it throws:
java.lang.UnsupportedOperationException: CSV data source does not support
struct<type:tinyint,size:int,indices:array<int>,values:array<double>> data
type.
Any idea?
Best regards,
Mina
On Tue, Mar 27, 2018 at 10:51 PM, naresh Goud wrote:
> In case of storing as parquet file I don’t think it requires header
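For the CSV error above, a hedged sketch of one workaround: stringify the
vector column with a small UDF before writing, since the CSV source cannot
serialize the vector's underlying struct type. The column name "features"
and DataFrame df are assumptions.

import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

// Register a UDF that renders the vector as plain text.
spark.udf().register("vecToString",
    (UDF1<Vector, String>) v -> v.toString(), DataTypes.StringType);

df.withColumn("features", callUDF("vecToString", col("features")))
  .coalesce(1)
  .write()
  .option("header", "true")
  .mode("overwrite")
  .csv("output");

Newer releases (Spark 3.0+) also ship
org.apache.spark.ml.functions.vector_to_array as a built-in alternative to a
hand-rolled UDF.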
.save("output")
The above command saves data but it's in parquet format.
How can I read the parquet file back and convert it to csv to observe the data?
When I use
df = spark.read.parquet("1.parquet"), it throws:
ERROR RetryingBlockFetcher: Exception while beginning fetch of 1
outstanding blocks
Your input is appreciated.
Best regards,
Mina
Hi,
I was hoping there is a method for casting a vector into a String (instead
of writing my own UDF), so that it can then be serialized into a csv/text file.
Best regards,
Mina
On Tue, Feb 20, 2018 at 6:52 PM, vermanurag wrote:
> If your dataframe has column types like vector then you cannot save it as csv
Hi Snehasish,
Unfortunately, none of the solutions worked.
Regards,
Mina
On Tue, Feb 20, 2018 at 5:12 PM, SNEHASISH DUTTA wrote:
> Hi Mina,
>
> Even text won't work; you may try this:
> df.coalesce(1).write.option("header","true").mode("overwrite").csv("output")
Hi Snehasish,
Using df.coalesce(1).write.option("header","true").mode("overwrite").csv("output")
throws
java.lang.UnsupportedOperationException: CSV data source does not support
struct<...> data type.
Regards,
Mina
On Tue, Feb 20, 2018
Hi,
I would like to serialize a dataframe with vector values into a text/csv in
pyspark.
Using the line below, I can write the dataframe (e.g. df) as parquet;
however, I cannot open it in Excel or as text.
df.coalesce(1).write.option("header","true").mode("overwrite").save("output")
Best regards,
Mina
Wondering how to save the result of an MLlib transformation function (e.g.
OneHotEncoder) which generates vectors into a file.
Best regards,
Mina
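A short sketch of the simplest option: Parquet understands the ML vector
type, so the transformer output can be persisted and read back as-is.
'encoded' names the assumed transform() result and 'spark' the SparkSession.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Save the vector-bearing DataFrame directly; no conversion needed.
encoded.write().mode("overwrite").parquet("encoded-output");
// Read it back later with the vector column intact.
Dataset<Row> restored = spark.read().parquet("encoded-output");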
(e.g. default=none).
Wondering what the cause is and how to fix it.
Best regards,
Mina
I have not tried rdd.unpersist(); I thought setting rdd = null does the same
thing, does it not?
On Wed, Oct 18, 2017 at 1:07 AM, Imran Rajjad wrote:
> did you try calling rdd.unpersist()
>
> On Wed, Oct 18, 2017 at 10:04 AM, Mina Aslani wrote:
>
>> Hi,
>>
>> I get "No space left on device"
publishes the result into Kafka. I set my RDD = null after I finish
working, so that intermediate
shuffle files are removed quickly.
How can I avoid "No space left on device"?
Best regards,
Mina
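As a hedged sketch of the difference: rdd = null only drops the reference
and leaves cleanup to GC-driven bookkeeping, whereas rdd.unpersist()
releases the cached blocks immediately. 'jsc' is an assumed
JavaSparkContext.

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.storage.StorageLevel;

JavaRDD<String> rdd = jsc.textFile("input");
rdd.persist(StorageLevel.MEMORY_AND_DISK());
// ... process rdd and publish the results to Kafka ...
rdd.unpersist(true);  // blocking=true: wait until the blocks are actually freed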
Hi, I get the below error when I try to run a job in a swarm node.
Can you please let me know what the problem is and how it can be fixed?
Best regards,
Mina
util.NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Exception
2.11/book/github/etc?
Regards,
Mina
not running locally. I tried using master="local[1]"; same problem.
Any idea?
Regards,
Mina
Master and worker processes are running!
On Wed, Mar 8, 2017 at 12:38 AM, ayan guha wrote:
> You need to start Master and worker processes before connecting to them.
>
> On Wed, Mar 8, 2017 at 3:33 PM, Mina Aslani wrote:
>
>> Hi,
>>
>> I am writing a spark Trans
- Why is no exception thrown when using "local[1]", and how do I set up
reading from Kafka in the VM?
- How can I stream from Kafka (the data in the topic is in JSON format)?
See the sketch below.
Your input is appreciated!
Best regards,
Mina
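A hedged sketch of the JSON question using Structured Streaming (assuming
Spark 2.1+ with the spark-sql-kafka-0-10 package on the classpath); the
broker address, topic name, and schema are placeholders, and 'spark' is the
assumed SparkSession.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

// Schema of the JSON payload (illustrative fields).
StructType schema = new StructType()
    .add("id", DataTypes.StringType)
    .add("value", DataTypes.DoubleType);

Dataset<Row> parsed = spark.readStream()
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka-vm:9092")
    .option("subscribe", "my-topic")
    .load()
    .selectExpr("CAST(value AS STRING) AS json")   // Kafka value bytes -> string
    .select(from_json(col("json"), schema).alias("data"))
    .select("data.*");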
Thank you Ankur for the quick response, really appreciate it! Making the
class serializable resolved the exception!
Best regards,
Mina
On Mon, Mar 6, 2017 at 4:20 PM, Ankur Srivastava wrote:
> The fix for this is to make your class Serializable. The reason being the
> closures you have defined
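To make the fix concrete, a minimal sketch (class and field names invented):
any class whose methods or fields are captured by a closure is shipped to
the executors and must therefore serialize cleanly.

import java.io.Serializable;
import org.apache.spark.api.java.JavaRDD;

public class MyTransformer implements Serializable {
  private final String prefix = ">> ";

  public JavaRDD<String> apply(JavaRDD<String> lines) {
    // The lambda captures 'this' through 'prefix', so MyTransformer
    // itself is serialized and sent to the executors.
    return lines.map(s -> prefix + s);
  }
}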
thrown?
Best regards,
Mina
System.out.println("Creating Spark Configuration");
SparkConf javaConf = new SparkConf();
javaConf.setAppName("My First Spark Java Application");
javaConf.setMaster("PATH to my spark"); // placeholder in the original message
System.out.println("Creating Spark Context");
JavaSparkContext javaSparkContext = new JavaSparkContext(javaConf);
streaming as well; the same error occurs!
Any idea about the cause of the error?
Kindest regards,
Mina
thoughts/experience/insight with me.
Best regards,
Mina
This is what a Radix tree returns
from an indexedRDD.
Thank you,
Mina
I would like to store my data in a Berkeley DB in Hadoop and run Spark for
data processing.
Is it possible?
Thanks
Mina
Hi, thank you for your answer, but I was talking about function references.
I want to transform an RDD using a function consisting of multiple
transforms. For example:
For example
import org.apache.spark.rdd.RDD

def transformFunc1(rdd: RDD[Int]): RDD[Int] = {
  rdd.map(_ + 1).filter(_ % 2 == 0) // illustrative body: several chained transforms
}
val rdd2 = transformFunc1(rdd1)...
Here I am using a function reference, I think, but I am