one hot encoding

2014-12-12 Thread Lochana Menikarachchi
Do we have one-hot encoding in Spark MLlib 1.1.1 or 1.2.0? It wasn't
available in 1.1.0.

Thanks.
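
[As far as I know, one-hot encoding didn't arrive in MLlib until the spark.ml OneHotEncoder in a later release, so a hand-rolled encoder is the usual workaround in 1.x. A minimal sketch against the MLlib Java API, assuming a single categorical string column; the class name and driver-side dictionary approach are illustrative, and assume the category set fits in driver memory:]

import java.util.HashMap;
import java.util.Map;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class OneHotSketch {
    public static JavaRDD<Vector> encode(JavaRDD<String> categories) {
        // Build the category -> position dictionary once, on the driver.
        final Map<String, Integer> index = new HashMap<>();
        for (String c : categories.distinct().collect()) {
            index.put(c, index.size());
        }
        final int size = index.size();
        // Each value becomes a sparse vector with a single 1.0 at its slot.
        return categories.map(c ->
                Vectors.sparse(size, new int[]{index.get(c)}, new double[]{1.0}));
    }
}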




Re: packaging spark run time with osgi service

2014-12-04 Thread Lochana Menikarachchi
I think the problem has to do with Akka not picking up the
reference.conf file in the assembly jar.


We managed to make Akka pick up the conf file by temporarily switching
the class loader:


Thread.currentThread().setContextClassLoader(JavaSparkContext.class.getClassLoader());
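
[For the archive, a fuller version of that workaround as a sketch, assuming the switch happens on the thread that creates the context; conf is the usual SparkConf:]

ClassLoader original = Thread.currentThread().getContextClassLoader();
// Switch to the loader that can see the Spark assembly jar, so Akka
// finds its reference.conf when the context starts.
Thread.currentThread().setContextClassLoader(JavaSparkContext.class.getClassLoader());
try {
    JavaSparkContext sc = new JavaSparkContext(conf);
    // ... build the model here ...
} finally {
    // Always restore, so the rest of the OSGi service is unaffected.
    Thread.currentThread().setContextClassLoader(original);
}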

The model gets built, but execution fails at a later stage with a Snappy
error:

14/12/04 08:07:44 ERROR Executor: Exception in task 0.0 in stage 105.0 (TID 104)
java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
    at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
    at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:320)
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
    at org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:125)
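
[Two things that may be worth trying for the Snappy failure, sketched here: Spark 1.x can be pointed at a different compression codec, and snappy-java steers its native-library loading through system properties. The values below are illustrative:]

// Option 1: sidestep Snappy entirely via Spark's codec setting.
conf.set("spark.io.compression.codec", "lzf");

// Option 2: steer snappy-java's JNI loading; set before Snappy is first used.
System.setProperty("org.xerial.snappy.use.systemlib", "true");
// or point at an unpacked copy of the native library:
// System.setProperty("org.xerial.snappy.lib.path", "/opt/native/lib");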

According to the Akka documentation, a config file can be passed with
-Dconfig.file=, but we couldn't get it to work.

Any ideas on how to do this?
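
[One more avenue, as a sketch: Typesafe Config honors the same settings as system properties, so they can be set programmatically before Akka first loads its configuration. The path below is purely illustrative:]

// Must run before the first SparkContext / ActorSystem is created.
System.setProperty("config.file", "/opt/service/conf/akka-spark.conf");
// or, for a classpath resource instead of a file:
// System.setProperty("config.resource", "my-akka.conf");
// Typesafe Config caches what it has already loaded, so drop any stale copy:
com.typesafe.config.ConfigFactory.invalidateCaches();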

Lochana




On 12/2/14 8:17 AM, Dinesh J. Weerakkody wrote:

Hi Lochana,

Can you please go through this mail thread [1]? I haven't tried it, but
it may be useful.


[1] 
http://apache-spark-user-list.1001560.n3.nabble.com/Packaging-a-spark-job-using-maven-td5615.html 



On Mon, Dec 1, 2014 at 4:28 PM, Lochana Menikarachchi
locha...@gmail.com wrote:


I have spark core and mllib as dependencies for a Spark-based OSGi
service. When I call the model-building method through a unit test
(without OSGi), it works OK. When I call it through the OSGi
service, nothing happens. I tried adding the Spark assembly jar. Now
it throws the following error:

An error occurred while building supervised machine learning
model: No configuration setting found for key 'akka.version'
com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
    at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:197)

What is the correct way to include Spark runtime dependencies in an
OSGi service? Thanks.

Lochana





--
Thanks & Best Regards,

*Dinesh J. Weerakkody*
www.dineshjweerakkody.com




spark osgi class loading issue

2014-12-04 Thread Lochana Menikarachchi
We are trying to call Spark through an OSGi service (with an OSGi-fied
version of the assembly jar). Spark does not work (due to the way it
reads Akka's reference.conf) unless we switch the class loader as follows:


Thread.currentThread().setContextClassLoader(JavaSparkContext.class.getClassLoader());

The problem is that there is then no way to switch between class loaders and
still get the information generated by the Spark operations back out.
Is there a way to run Spark through an OSGi service? I think this might work
if we could load reference.conf some other way. Can somebody shed some light
on this?
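
[One pattern that may sidestep the hand-back problem, as a sketch: do the Spark work inside the switched scope and return ordinary objects after restoring the loader. buildModel, sc, and trainingPath are hypothetical:]

ClassLoader original = Thread.currentThread().getContextClassLoader();
Thread.currentThread().setContextClassLoader(JavaSparkContext.class.getClassLoader());
final LogisticRegressionModel model;
try {
    model = buildModel(sc, trainingPath);  // hypothetical helper doing the Spark calls
} finally {
    Thread.currentThread().setContextClassLoader(original);
}
// `model` is an ordinary object reference: it stays usable here even
// though the context class loader has been switched back.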

Thanks.

Lochana





Re: packaging spark run time with osgi service

2014-12-01 Thread Lochana Menikarachchi

Already tried the solutions they provided; they did not work out.
On 12/2/14 8:17 AM, Dinesh J. Weerakkody wrote:

Hi Lochana,

Can you please go through this mail thread [1]? I haven't tried it, but
it may be useful.


[1] 
http://apache-spark-user-list.1001560.n3.nabble.com/Packaging-a-spark-job-using-maven-td5615.html 



On Mon, Dec 1, 2014 at 4:28 PM, Lochana Menikarachchi
locha...@gmail.com wrote:


I have spark core and mllib as dependencies for a Spark-based OSGi
service. When I call the model-building method through a unit test
(without OSGi), it works OK. When I call it through the OSGi
service, nothing happens. I tried adding the Spark assembly jar. Now
it throws the following error:

An error occurred while building supervised machine learning
model: No configuration setting found for key 'akka.version'
com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
    at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
    at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
    at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:197)

What is the correct way to include Spark runtime dependencies in an
OSGi service? Thanks.

Lochana





--
Thanks & Best Regards,

*Dinesh J. Weerakkody*
www.dineshjweerakkody.com




label points with a given index

2014-10-23 Thread Lochana Menikarachchi


SparkConf conf = new SparkConf().setAppName("LogisticRegression").setMaster("local[4]");

JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> lines = sc.textFile("some.csv");
JavaRDD<LabeledPoint> lPoints = lines.map(new CSVLineParser());

Is there any way to pass an index to a function? For example, instead of
hard-coding parts[0] below, is there a way to pass it in?




public class CSVLineParser implements Function<String, LabeledPoint> {
    private static final Pattern COMMA = Pattern.compile(",");

    @Override
    public LabeledPoint call(String line) {
        String[] parts = COMMA.split(line);
        double y = Double.parseDouble(parts[0]);
        // The label sits in column 0, so the feature array is one shorter.
        double[] x = new double[parts.length - 1];
        for (int i = 1; i < parts.length; ++i) {
            x[i - 1] = Double.parseDouble(parts[i]);
        }
        return new LabeledPoint(y, Vectors.dense(x));
    }
}




Re: label points with a given index

2014-10-23 Thread Lochana Menikarachchi

Figured out that a constructor can be used for this purpose.
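
[For the archive, a minimal sketch of that constructor approach: the label column index is passed in and captured by the function object. Spark's Java Function is already Serializable:]

import java.util.regex.Pattern;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

public class CSVLineParser implements Function<String, LabeledPoint> {
    private static final Pattern COMMA = Pattern.compile(",");
    private final int labelIndex;

    public CSVLineParser(int labelIndex) {
        this.labelIndex = labelIndex;
    }

    @Override
    public LabeledPoint call(String line) {
        String[] parts = COMMA.split(line);
        double y = Double.parseDouble(parts[labelIndex]);
        double[] x = new double[parts.length - 1];
        int j = 0;
        for (int i = 0; i < parts.length; i++) {
            if (i != labelIndex) {        // skip the label column
                x[j++] = Double.parseDouble(parts[i]);
            }
        }
        return new LabeledPoint(y, Vectors.dense(x));
    }
}

// usage: JavaRDD<LabeledPoint> lPoints = lines.map(new CSVLineParser(0));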
On 10/24/14 7:57 AM, Lochana Menikarachchi wrote:


SparkConf conf = new SparkConf().setAppName("LogisticRegression").setMaster("local[4]");

JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> lines = sc.textFile("some.csv");
JavaRDD<LabeledPoint> lPoints = lines.map(new CSVLineParser());

Is there any way to pass an index to a function? For example, instead
of hard-coding parts[0] below, is there a way to pass it in?




public class CSVLineParser implements Function<String, LabeledPoint> {
    private static final Pattern COMMA = Pattern.compile(",");

    @Override
    public LabeledPoint call(String line) {
        String[] parts = COMMA.split(line);
        double y = Double.parseDouble(parts[0]);
        // The label sits in column 0, so the feature array is one shorter.
        double[] x = new double[parts.length - 1];
        for (int i = 1; i < parts.length; ++i) {
            x[i - 1] = Double.parseDouble(parts[i]);
        }
        return new LabeledPoint(y, Vectors.dense(x));
    }
}






Hyper Parameter Tuning Algorithms

2014-10-05 Thread Lochana Menikarachchi

Found this thread from April:

http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3ccabjxkq6b7sfaxie4+aqtcmd8jsqbznsxsfw6v5o0wwwouob...@mail.gmail.com%3E

Wondering what the status of this is. We are thinking about implementing
these algorithms, and it would be a waste if they are already available.


Please advise.
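
[In the meantime, plain grid search is easy to hand-roll on top of MLlib; a minimal sketch over the regularization parameter, assuming MLlib 1.x's SVMWithSGD. The grid, split, and accuracy metric are illustrative:]

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.regression.LabeledPoint;

public class GridSearchSketch {
    public static SVMModel bestByGrid(JavaRDD<LabeledPoint> data) {
        // Hold out a validation set for scoring each candidate.
        JavaRDD<LabeledPoint>[] splits = data.randomSplit(new double[]{0.8, 0.2}, 11L);
        JavaRDD<LabeledPoint> train = splits[0];
        JavaRDD<LabeledPoint> validation = splits[1];

        double[] regParams = {0.001, 0.01, 0.1, 1.0};  // illustrative grid
        SVMModel best = null;
        double bestAccuracy = -1.0;
        for (double regParam : regParams) {
            // numIterations=100, stepSize=1.0, miniBatchFraction=1.0
            final SVMModel model = SVMWithSGD.train(train.rdd(), 100, 1.0, regParam, 1.0);
            // Fraction of validation points the model labels correctly.
            double accuracy = validation
                    .filter(p -> model.predict(p.features()) == p.label())
                    .count() / (double) validation.count();
            if (accuracy > bestAccuracy) {
                bestAccuracy = accuracy;
                best = model;
            }
        }
        return best;
    }
}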

Thanks.

Lochana




Hyper Parameter Optimization Algorithms

2014-09-29 Thread Lochana Menikarachchi

Hi,

Is there anyone working on hyperparameter optimization algorithms? If
not, is there any interest in the subject? We are thinking about
implementing some of these algorithms and contributing them to Spark. Thoughts?


Lochana
