How to install? Just git clone the code from https://github.com/apache/spark/pull/216 and then sbt package?
Is it the same as https://github.com/LIDIAgroup/SparkFeatureSelection, or something different?
filip
I guess I found it:
https://github.com/LIDIAgroup/SparkFeatureSelection
Is there any news about discretization in Spark? Is there anything on Git? I haven't found it yet.
I guess it is not a question about Spark but a question about how you need to set up your dataset. Think about what you want to model and how you can shape the data in such a way that Spark can use it. Akima is a technique I know:

a_{t+1} = C_1 * a_t + C_2 * a_{t-1} + ... + C_6 * a_{t-5}

Spark can find the coefficients.
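The formula above is a sixth-order autoregression, so one way to let Spark find the coefficients C_1..C_6 is to build lagged feature vectors and fit them with MLlib's linear regression. A minimal sketch under that assumption; the object name and toy series are placeholders:

// Sketch only: fit AR(6) coefficients with MLlib linear regression.
// `series` is a placeholder toy signal; replace it with your own data.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

object Ar6Sketch {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf().setAppName("Ar6Sketch").setMaster("local[2]"))
    val series = (1 to 200).map(t => math.sin(t / 10.0)).toArray
    // Each window of 7 values yields label a_{t+1} and features a_t .. a_{t-5},
    // so the fitted weight at index i corresponds to C_{i+1}.
    val examples = series.sliding(7).map { w =>
      LabeledPoint(w.last, Vectors.dense(w.init.reverse))
    }.toSeq
    val data = sc.parallelize(examples)
    val model = LinearRegressionWithSGD.train(data, 100)  // 100 SGD iterations
    println("coefficients C_1..C_6: " + model.weights)
    sc.stop()
  }
}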
Hi folks,
is there a function in Spark like numpy's digitize, which discretizes a numerical variable? Or, even better, is there a way to use the functionality of the decision tree builder in Spark MLlib, which splits data into bins in such a way that the split variable best predicts the target value?
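I am not aware of a direct numpy.digitize equivalent in the RDD API (newer Spark versions ship org.apache.spark.ml.feature.Bucketizer for this), but a digitize-style mapping is easy to sketch by hand. Everything below is a placeholder, assuming an existing SparkContext sc:

// digitize-style binning sketch; `splits` and `values` are placeholders.
val splits = Array(0.0, 10.0, 50.0, 100.0)  // sorted bin edges

// Index of the bin x falls into, in the spirit of numpy.digitize:
// 0 for x < splits(0), splits.length for x >= splits.last.
def digitize(x: Double, splits: Array[Double]): Int =
  splits.count(_ <= x)

val values = sc.parallelize(Seq(-3.0, 7.5, 42.0, 99.0, 250.0))
val binned = values.map(x => digitize(x, splits))
binned.collect().foreach(println)  // prints 0, 1, 2, 3, 4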
OK, I see :-)
".." instead of "~" works fine. Do you know the reason? (I suspect the shell only expands "~" outside of quotes, so sbt hands the literal "~/..." path to the application, which cannot resolve it.)

sbt "run [options]" works after sbt package, but

spark-submit --class "ClassName" --master local[2] target/scala/JarPackage.jar [options]

doesn't? It cannot resolve everything somehow.
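One common cause, for what it's worth: sbt "run" puts all declared library dependencies on the classpath, while sbt package produces a jar containing only your own classes, so spark-submit sees only those plus Spark itself. A minimal build.sbt sketch; the name and version numbers are my assumptions:

// Minimal build.sbt sketch; the name and versions are assumptions.
name := "simple-app"

version := "0.1"

scalaVersion := "2.10.4"

// "provided": spark-submit supplies Spark itself at runtime; any other
// third-party dependency must be bundled (e.g. with sbt-assembly) or
// handed to spark-submit via --jars.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0" % "provided"

One caveat: with "provided" scope, plain sbt "run" no longer sees the Spark classes, so a common trick is to drop the scope while developing locally and restore it when packaging for spark-submit.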
Compilation works, but execution does not, at least with spark-submit, as I described above.
When I make a local copy of the training set, I can execute sbt "run file", which works:

sbt "run sample_linear_regression_data.txt"

But when I do

sbt "run ~/git/spark/data/mllib/sample_linear_regression_data.txt"

the run fails.
I'm trying to get used to sbt in order to build standalone applications by myself. I managed to run the "SimpleApp" example. Then I tried to copy an example Scala program like "LinearRegression" into a local directory:
.
./build.sbt
./src
./src/main
./src/main/scala
./src/main/scala/LinearRegression.sc
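One thing to check: sbt only compiles *.scala (and *.java) sources under src/main/scala, so a worksheet-style LinearRegression.sc will not be picked up; renaming it to LinearRegression.scala should help. A minimal sketch of what that file could contain, taking the data path from args as in sbt "run <path>"; the names and the use of the LibSVM loader are my assumptions:

// Sketch of src/main/scala/LinearRegression.scala; details are assumptions.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.regression.LinearRegressionWithSGD
import org.apache.spark.mllib.util.MLUtils

object LinearRegression {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("LinearRegression").setMaster("local[2]")
    val sc = new SparkContext(conf)
    // args(0): path passed on the command line, e.g. sbt "run data.txt"
    val data = MLUtils.loadLibSVMFile(sc, args(0))
    val model = LinearRegressionWithSGD.train(data, 100)
    println("weights: " + model.weights)
    sc.stop()
  }
}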
Hey guys,
I'm still trying to get used to compiling and running the example code. Why does the run-example script submit the class with org.apache.spark.examples in front of the class name itself? Probably a stupid question, but I would be glad if someone explained it.
By the way, how was the "spark...example..
Got it when I read the class reference:
https://spark.apache.org/docs/0.9.1/api/core/index.html#org.apache.spark.SparkConf

conf.setMaster("local[2]")

sets the master to local with 2 threads.
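For context, a minimal sketch of the full setup around that call; the app name is a placeholder:

// Sketch: build a SparkConf and context for a standalone app.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MyApp")      // placeholder name
  .setMaster("local[2]")    // local mode, 2 worker threads
val sc = new SparkContext(conf)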
But I still get some warnings, and the result (see below) is also not right, I think.
PS: by the way ... first
Hi guys,
can someone explain, or give a stupid user like me a link where I can find, the right usage of sbt and Spark in order to run the examples as a standalone app? I got to the point of running the app with sbt "run path-to-the-data", but I still get some errors, because I probably didn't tell the app the master to use.
I am wondering if I can use Spark in order to search for interesting features/attributes for modelling. In fact, I just came from some introductory pages about Vowpal Wabbit; I somehow like the idea of out-of-core modelling.
Well, I have transactional data where customers purchased products.
@villu: thank you for your help. I promise I'm going to try it! That's cool :-)
Do you also know the other way around, from PMML to a model object in Spark?
Well, I guess your problem is quite unbalanced, and with information value as the splitting criterion, I guess the algorithm stops after very few splits.
A workaround is oversampling: build many training datasets, e.g. take randomly 50% of the positives and the same amount from the negatives, or let's say ...
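A rough sketch of that resampling idea, assuming data is an RDD[LabeledPoint] with labels 1.0/0.0; all names are placeholders:

// Oversampling sketch: one balanced training set out of many.
// Assumes `data: RDD[LabeledPoint]` with labels 1.0 (positive) / 0.0.
val positives = data.filter(_.label == 1.0)
val negatives = data.filter(_.label == 0.0)

// Take randomly 50% of the positives ...
val posSample = positives.sample(false, 0.5, seed = 42)

// ... and the same amount from the negatives.
val negFraction = posSample.count().toDouble / negatives.count()
val negSample = negatives.sample(false, negFraction, seed = 42)

val balanced = posSample.union(negSample)
// Repeat with different seeds to build many training sets.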
@Paco: I understand that the most promising options for me to put effort into understanding, for deploying models in the Spark environment, would be Augustus and Zementis, right?
Actually, as you mention, I would have both directions of deploying: I already have models which I could transform into PMML, and I also t...
Thank you very much.
The Cascading project I hadn't come across at all until now; it is very interesting.
I also got the idea of using Scala as the language for Spark, because I can integrate JVM-based libraries very easily/naturally, if I got it right.
Hm ... but I could also use Spa...
Hello guys,
does anybody have experience with the Augustus library as a serializer for scoring models? It looks very promising, and I even found a hint about the connection between Augustus and Spark.
All the best
Am I right to just use cPickle for serializing a model (see code below), or did I miss the way to do it with PickleSerializer (from pyspark.serializers import PickleSerializer)?

...
from pyspark.mllib.classification import LogisticRegressionWithSGD
import cPickle  # Python 2 pickle module

# parsedData is prepared in the elided part above.
model = LogisticRegressionWithSGD.train(parsedData)

# Serialize the trained model object to a local file.
mm = open("mm.txt", "wb")
cPickle.dump(model, mm)
mm.close()
# Restore later with: model = cPickle.load(open("mm.txt", "rb"))