We have two big tables, each containing 5 billion rows, so my question here
is: should we partition/sort the data and convert it to Parquet before doing
any join?
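Not Spark itself, but a toy sort-merge join in plain Python may help illustrate why pre-sorting and co-partitioning pay off before a large join: once both sides are sorted on the join key, joining is a single linear pass, which is the same idea behind Spark's sort-merge join over sorted Parquet data. The keys and values below are invented purely for illustration.

```python
# Toy sort-merge join: once both inputs are sorted on the key,
# joining is one linear pass -- no hashing or repeated scans needed.
def sort_merge_join(left, right):
    """Inner-join two lists of (key, value) pairs, each sorted by key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit all pairings for this key (handles duplicate keys on the right).
            j2 = j
            while j2 < len(right) and right[j2][0] == lk:
                out.append((lk, left[i][1], right[j2][1]))
                j2 += 1
            i += 1
    return out

left = sorted([(2, "a"), (1, "b"), (3, "c")])
right = sorted([(3, "x"), (2, "y")])
print(sort_merge_join(left, right))  # [(2, 'a', 'y'), (3, 'c', 'x')]
```

The linear pass is only possible because both inputs arrive sorted; that is the work you pay for once when you partition/sort before writing Parquet.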
Best Regards ... Amin Mohebbi, PhD candidate in Software Engineering at university of Malaysia, Tel: +60 18 2040 017, E-Mail: tp025...@ex.apiit.edu.my / amin_...@me.com
Does anyone know how to transpose columns in Spark/Scala?
This is how I want to unpivot the table, based on multiple columns:
I am using Scala and Spark to unpivot a table.
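In Spark this is usually done with the built-in `stack()` SQL function (e.g. `selectExpr("id", "stack(2, 'y2016', y2016, 'y2017', y2017)")`). As a language-neutral sketch of the same reshaping, here is a "melt"/unpivot in plain Python; the column names (`id`, `y2016`, `y2017`) are invented for illustration, not taken from the original table.

```python
# Unpivot ("melt"): turn wide value columns into (column-name, value) rows.
def melt(rows, id_cols, value_cols):
    """rows: list of dicts. Emits one output row per (input row, value column)."""
    out = []
    for row in rows:
        for col in value_cols:
            melted = {k: row[k] for k in id_cols}  # carry the id columns through
            melted["variable"] = col               # which wide column this came from
            melted["value"] = row[col]             # that column's value
            out.append(melted)
    return out

wide = [{"id": 1, "y2016": 10, "y2017": 20}]
print(melt(wide, ["id"], ["y2016", "y2017"]))
# [{'id': 1, 'variable': 'y2016', 'value': 10},
#  {'id': 1, 'variable': 'y2017', 'value': 20}]
```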
I am passing extra jars with --jars
/home/sshuser/reactiveinflux-spark_2.10-1.4.0.10.0.5.1.jar sapn_2.11-1.0.jar
(note that spark-submit's --jars option expects a comma-separated list, with the application jar given separately at the end of the command).
Can you help to solve this issue?
... a file system / time series DB / Azure Cosmos / standard DB?
2- Is it the right approach to use Spark as the ETL and aggregation application,
store the results somewhere, and use Power BI for reporting and dashboard purposes?
... Spark with NoSQL, as I
think the combination of these two could provide random access and support many
queries from different users. 2- Do we really need to use a time series DB?
"org.apache.spark" % "spark-core_2.10" % "1.1.1", but there is still an error that says: unresolved
dependency spark-mllib;1.1.1: not found. Does anyone know how to add the MLlib
dependency in a .sbt file?
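Assuming Spark 1.1.1 is the version actually in use (as the error message suggests), the missing piece is likely a spark-mllib entry alongside spark-core in build.sbt. This is a sketch of what that dependency block might look like; adjust the Scala suffix and version to match your installation:

```scala
// build.sbt -- hypothetical sketch, versions assumed from the error message
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10"  % "1.1.1",
  "org.apache.spark" % "spark-mllib_2.10" % "1.1.1"
)
```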
... I do not want to use MLlib and would like to write my own k-means.
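For a hand-rolled k-means, a minimal pure-Python sketch of Lloyd's algorithm might look like the following; the 2-D points, k=2, and iteration count are invented for illustration, and a real Spark version would distribute the assignment step across partitions.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal Lloyd's algorithm on 2-D points given as [x, y] lists."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # naive init: k distinct random points
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[d.index(min(d))].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:  # leave a center unchanged if its cluster went empty
                centers[i] = [sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl)]
    return centers

pts = [[0, 0], [0, 1], [10, 10], [10, 11]]
print(sorted(kmeans(pts, 2)))  # two centers, near [0, 0.5] and [10, 10.5]
```

With two well-separated blobs like this, any distinct pair of initial points converges to the two cluster means within a few iterations.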
Can anyone explain to me how I can do the pre-processing step before running
k-means using Spark?
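Pre-processing for k-means typically means parsing the features into numeric vectors and scaling them, so that no single large-valued feature dominates the Euclidean distance. As a Spark-free sketch, here is plain-Python z-score standardization; the sample data is invented, and in Spark the per-column means and standard deviations would come from an aggregation over the RDD or DataFrame.

```python
def standardize(rows):
    """Z-score each column: (x - mean) / stddev. rows: equal-length numeric lists."""
    cols = list(zip(*rows))                       # column-wise view of the data
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5
            for c, m in zip(cols, means)]         # population std per column
    return [[(x - m) / s if s else 0.0            # guard constant columns
             for x, m, s in zip(row, means, stds)]
            for row in rows]

data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
print(standardize(data))  # each column now has mean 0 and unit variance
```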
... [Errno -5] No address associated with hostname
>>> sc.parallelize(range(1000)).count()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sc' is not defined
>>> sc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sc' is not defined
Can anyone explain to me the difference between k-means in MLlib and the k-means
in examples/src/main/python/kmeans.py?
... spark-submit?
stackoverflow.com/questions/24571922/apache-spark-stderr-and-stdout/24594576#24594576
I am not sure whether I need to set an IP address for the driver? Do I need a
separate machine for the driver?
I have the following in spark-env.sh (note that the variable names are case-sensitive, so SPARK_MASTER_port would be ignored):
SPARK_MASTER_IP=master
SPARK_MASTER_PORT=7077
... rker@slave2:41483/user/Worker" "app-20140704174955-0002"
14/07/04 17:50:14 ERROR CoarseGrainedExecutorBackend: Driver Disassociated
[akka.tcp://sparkExecutor@slave2:33758] -> [akka.tcp://spark@master:54477]
disassociated! Shutting down