Re: Spark-ML : Streaming library for Factorization Machine (FM/FFM)

2018-04-16 Thread Maximilien DEFOURNE

Hi,

Unfortunately no. I just used this lib for raw FM and FFM. I thought it 
could be a good baseline for your need.


Regards

Maximilien


On 16/04/18 15:43, Sundeep Kumar Mehta wrote:

Hi Maximilien,

Thanks for your response. Did you convert this repo to use DStreams for 
continuous/incremental training?


Regards
Sundeep






Re: Spark-ML : Streaming library for Factorization Machine (FM/FFM)

2018-04-16 Thread Sundeep Kumar Mehta
Hi Maximilien,

Thanks for your response. Did you convert this repo to use DStreams for
continuous/incremental training?

Regards
Sundeep

On Mon, Apr 16, 2018 at 4:17 PM, Maximilien DEFOURNE <
maximilien.defou...@s4m.io> wrote:

> Hi,
>
> I used this repo for FM/FFM: https://github.com/Intel-bigdata/imllib-spark
>
>
> Regards
>
> Maximilien DEFOURNE
>
> On 15/04/18 05:14, Sundeep Kumar Mehta wrote:
>
> Hi All,
>
> Is there any library or GitHub project for using factorization machines or
> field-aware factorization machines with online learning for continuous training?
>
> Please share your thoughts on this.
>
> Regards
> Sundeep
>
>
>
>


Re: Spark-ML : Streaming library for Factorization Machine (FM/FFM)

2018-04-16 Thread Maximilien DEFOURNE

Hi,

I used this repo for FM/FFM : https://github.com/Intel-bigdata/imllib-spark


Regards

Maximilien DEFOURNE


On 15/04/18 05:14, Sundeep Kumar Mehta wrote:

Hi All,

Is there any library or GitHub project for using factorization machines or 
field-aware factorization machines with online learning for continuous training?


Please share your thoughts on this.

Regards
Sundeep






Spark-ML : Streaming library for Factorization Machine (FM/FFM)

2018-04-14 Thread Sundeep Kumar Mehta
Hi All,

Is there any library or GitHub project for using factorization machines or
field-aware factorization machines with online learning for continuous training?

Please share your thoughts on this.

Regards
Sundeep
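
For context, Spark's own MLlib exposes this kind of online training only for a
few linear models, via the trainOn/predictOn pattern. Below is a minimal sketch
of that pattern using StreamingLinearRegressionWithSGD purely as a stand-in (an
FM/FFM port would need to expose the same hooks; the paths, batch interval and
feature dimension are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Spark ships no streaming FM/FFM; this only shows the incremental-training wiring.
val conf = new SparkConf().setAppName("streaming-training-sketch")
val ssc = new StreamingContext(conf, Seconds(10))

val trainingStream = ssc.textFileStream("hdfs:///data/train").map(LabeledPoint.parse)
val scoringStream  = ssc.textFileStream("hdfs:///data/score").map(LabeledPoint.parse)

val model = new StreamingLinearRegressionWithSGD()
  .setInitialWeights(Vectors.zeros(10))   // assumed feature dimension
  .setStepSize(0.1)

model.trainOn(trainingStream)             // the model is updated on every batch
model.predictOnValues(scoringStream.map(lp => (lp.label, lp.features))).print()

ssc.start()
ssc.awaitTermination()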


Re: Spark 2.1 ml library scalability

2017-04-07 Thread Nick Pentreath
It's true that CrossValidator is not parallel currently - see
https://issues.apache.org/jira/browse/SPARK-19357 and feel free to help
review.
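
For what it's worth, once that ticket landed the change looks roughly like the
sketch below (assuming Spark 2.3+, where CrossValidator gained setParallelism;
the training DataFrame and fold count are placeholders):

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

val lr = new LogisticRegression()
val grid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.0001, 0.001, 0.005, 0.01, 0.05, 0.1))
  .build()

val cv = new CrossValidator()
  .setEstimator(lr)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(grid)
  .setNumFolds(3)
  .setParallelism(4)   // evaluate up to 4 models concurrently (Spark 2.3+)

// val cvModel = cv.fit(trainingDf)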

On Fri, 7 Apr 2017 at 14:18 Aseem Bansal  wrote:

>
>- Limited the data to 100,000 records.
>- 6 categorical features which go through imputation, string indexing, and
>one-hot encoding. The maximum number of classes for a feature is 100. As the
>data is imputed it becomes dense.
>- 1 numerical feature.
>- Training LogisticRegression through CrossValidator with a grid to
>optimize its regularization parameter over the values 0.0001, 0.001, 0.005,
>0.01, 0.05, 0.1
>- Using Spark's launcher API to launch it on a YARN cluster on Amazon
>AWS.
>
> I was thinking that since CrossValidator is finding the best parameters, it
> should be able to evaluate them independently. That sounds like something
> which could be run in parallel.
>
>
> On Fri, Apr 7, 2017 at 5:20 PM, Nick Pentreath 
> wrote:
>
> What is the size of the training data (number of examples, number of features)?
> Dense or sparse features? How many classes?
>
> What commands are you using to submit your job via spark-submit?
>
> On Fri, 7 Apr 2017 at 13:12 Aseem Bansal  wrote:
>
> When using Spark ML's LogisticRegression, RandomForest, CrossValidator,
> etc., do we need to take anything into consideration while coding to make it
> scale with more CPUs, or does it scale automatically?
>
> I am reading some data from S3 and using a pipeline to train a model. I am
> running the job on a Spark cluster with 36 cores and 60 GB RAM and I cannot
> see much usage. It is running, but I was expecting Spark to use all available
> RAM and run faster. So I was wondering whether we need to take something
> particular into consideration, or whether my expectations are wrong.
>
>
>


Re: Spark 2.1 ml library scalability

2017-04-07 Thread Aseem Bansal
   - Limited the data to 100,000 records.
   - 6 categorical features which go through imputation, string indexing, and
   one-hot encoding. The maximum number of classes for a feature is 100. As the
   data is imputed it becomes dense.
   - 1 numerical feature.
   - Training LogisticRegression through CrossValidator with a grid to
   optimize its regularization parameter over the values 0.0001, 0.001, 0.005,
   0.01, 0.05, 0.1
   - Using Spark's launcher API to launch it on a YARN cluster on Amazon
   AWS.

I was thinking that since CrossValidator is finding the best parameters, it
should be able to evaluate them independently. That sounds like something
which could be run in parallel.
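
For reference, a minimal sketch of a pipeline along those lines, with the stages
and the parameter grid wired together (column names are made up, imputation is
assumed to happen upstream since it is custom here, and this is written against
the Spark 2.x ml API):

import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer, VectorAssembler}
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Hypothetical column names; missing values are assumed to be imputed upstream.
val categoricalCols = Seq("cat1", "cat2", "cat3", "cat4", "cat5", "cat6")

val indexers = categoricalCols.map(c =>
  new StringIndexer().setInputCol(c).setOutputCol(c + "_idx"))
val encoders = categoricalCols.map(c =>
  new OneHotEncoder().setInputCol(c + "_idx").setOutputCol(c + "_vec"))

val assembler = new VectorAssembler()
  .setInputCols((categoricalCols.map(_ + "_vec") :+ "num1").toArray)
  .setOutputCol("features")

val lr = new LogisticRegression().setLabelCol("label").setFeaturesCol("features")

val stages: Array[PipelineStage] =
  (indexers ++ encoders ++ Seq(assembler, lr)).toArray
val pipeline = new Pipeline().setStages(stages)

val grid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.0001, 0.001, 0.005, 0.01, 0.05, 0.1))
  .build()

val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(grid)
  .setNumFolds(3)

// val model = cv.fit(trainingDf)   // trainingDf holds the raw columns above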


On Fri, Apr 7, 2017 at 5:20 PM, Nick Pentreath 
wrote:

> What is the size of the training data (number of examples, number of features)?
> Dense or sparse features? How many classes?
>
> What commands are you using to submit your job via spark-submit?
>
> On Fri, 7 Apr 2017 at 13:12 Aseem Bansal  wrote:
>
>> When using Spark ML's LogisticRegression, RandomForest, CrossValidator,
>> etc., do we need to take anything into consideration while coding to make it
>> scale with more CPUs, or does it scale automatically?
>>
>> I am reading some data from S3 and using a pipeline to train a model. I am
>> running the job on a Spark cluster with 36 cores and 60 GB RAM and I cannot
>> see much usage. It is running, but I was expecting Spark to use all available
>> RAM and run faster. So I was wondering whether we need to take something
>> particular into consideration, or whether my expectations are wrong.
>>
>


Re: Spark 2.1 ml library scalability

2017-04-07 Thread Nick Pentreath
What is the size of the training data (number of examples, number of features)?
Dense or sparse features? How many classes?

What commands are you using to submit your job via spark-submit?
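
For comparison, a typical YARN submission spells out the resources explicitly;
the defaults on YARN are small, so without flags like these most of a 36-core /
60 GB cluster can sit idle. A sketch only (the class name, jar and numbers are
placeholders, not recommendations):

# Sketch: adjust the numbers to your cluster and data.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 6 \
  --executor-cores 5 \
  --executor-memory 8g \
  --driver-memory 4g \
  --class com.example.TrainJob \
  my-training-job.jar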

On Fri, 7 Apr 2017 at 13:12 Aseem Bansal  wrote:

> When using Spark ML's LogisticRegression, RandomForest, CrossValidator,
> etc., do we need to take anything into consideration while coding to make it
> scale with more CPUs, or does it scale automatically?
>
> I am reading some data from S3 and using a pipeline to train a model. I am
> running the job on a Spark cluster with 36 cores and 60 GB RAM and I cannot
> see much usage. It is running, but I was expecting Spark to use all available
> RAM and run faster. So I was wondering whether we need to take something
> particular into consideration, or whether my expectations are wrong.
>


Spark 2.1 ml library scalability

2017-04-07 Thread Aseem Bansal
When using Spark ML's LogisticRegression, RandomForest, CrossValidator,
etc., do we need to take anything into consideration while coding to make it
scale with more CPUs, or does it scale automatically?

I am reading some data from S3 and using a pipeline to train a model. I am
running the job on a Spark cluster with 36 cores and 60 GB RAM and I cannot
see much usage. It is running, but I was expecting Spark to use all available
RAM and run faster. So I was wondering whether we need to take something
particular into consideration, or whether my expectations are wrong.


Spark SQL Avro Library for 1.2

2015-04-08 Thread roy
How do I build the Spark SQL Avro library for Spark 1.2?

I was following this https://github.com/databricks/spark-avro and was able
to build spark-avro_2.10-1.0.0.jar by simply running sbt/sbt package from
the project root.

But we are on Spark 1.2 and need a compatible spark-avro jar.

Any idea how I can do that?

Thanks
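
One hedged approach is to recompile the library against the Spark version you
deploy on before running sbt/sbt package; the sketch below is an assumption
about the build file, since the actual spark-avro build may use a version
variable instead of explicit dependencies:

// build.sbt override -- sketch only, verify against the project's real build
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.2.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.2.1" % "provided"
)

The rebuilt jar can then be put on the classpath with --jars when running
spark-shell or spark-submit.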






Spark as a Library

2014-09-16 Thread Ruebenacker, Oliver A

 Hello,

  Suppose I want to use Spark from an application that I already submit to run 
in another container (e.g. Tomcat). Is this at all possible? Or do I have to 
split the app into two components, and submit one to Spark and one to the other 
container? In that case, what is the preferred way for the two components to 
communicate with each other? Thanks!

 Best, Oliver

Oliver Ruebenacker | Solutions Architect

Altisource™
290 Congress St, 7th Floor | Boston, Massachusetts 02210
P: (617) 728-5582 | ext: 275585
oliver.ruebenac...@altisource.com | 
www.Altisource.com



Re: Spark as a Library

2014-09-16 Thread Matei Zaharia
If you want to run the computation on just one machine (using Spark's local 
mode), it can probably run in a container. Otherwise you can create a 
SparkContext there and connect it to a cluster outside. Note that I haven't 
tried this though, so the security policies of the container might be too 
restrictive. In that case you'd have to run the app outside and expose an RPC 
interface between them.

Matei
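
A minimal sketch of the embedded approach (master URL, jar path and the toy
computation are placeholders; setJars matters because, without spark-submit,
the executors otherwise have no way to load your classes):

import org.apache.spark.{SparkConf, SparkContext}

// Created inside the already-running application (e.g. a servlet),
// rather than in a driver launched by spark-submit.
val conf = new SparkConf()
  .setAppName("embedded-spark-example")
  .setMaster("spark://master-host:7077")        // or "local[*]" for local mode
  .setJars(Seq("/path/to/app-spark-code.jar"))  // code the executors must load
val sc = new SparkContext(conf)

val total = sc.parallelize(1 to 1000).reduce(_ + _)
println(s"sum = $total")
sc.stop()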

On September 16, 2014 at 8:17:08 AM, Ruebenacker, Oliver A 
(oliver.ruebenac...@altisource.com) wrote:

 

 Hello,

 

  Suppose I want to use Spark from an application that I already submit to run 
in another container (e.g. Tomcat). Is this at all possible? Or do I have to 
split the app into two components, and submit one to Spark and one to the other 
container? In that case, what is the preferred way for the two components to 
communicate with each other? Thanks!

 

 Best, Oliver

 

Oliver Ruebenacker | Solutions Architect

 

Altisource™

290 Congress St, 7th Floor | Boston, Massachusetts 02210

P: (617) 728-5582 | ext: 275585

oliver.ruebenac...@altisource.com | www.Altisource.com

 


Re: Spark as a Library

2014-09-16 Thread Soumya Simanta
It depends on what you want to do with Spark. The following has worked for
me.
Let the container handle the HTTP request and then talk to Spark using
another HTTP/REST interface. You can use the Spark Job Server for this.

Embedding Spark inside the container is not a great long-term solution IMO,
because you may run into issues when you want to connect to a Spark cluster.
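
For reference, the web tier's interaction with the job server is plain HTTP. A
sketch along the lines of the spark-jobserver README (host, port, app name and
job class are placeholders; check the endpoints against the version you deploy):

# Upload the jar that contains the job class once:
curl --data-binary @my-spark-jobs.jar http://jobserver-host:8090/jars/myapp

# Then submit jobs from the web application as needed:
curl -d "input.path = s3n://bucket/data" \
  "http://jobserver-host:8090/jobs?appName=myapp&classPath=com.example.MyJob"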



On Tue, Sep 16, 2014 at 11:16 AM, Ruebenacker, Oliver A 
oliver.ruebenac...@altisource.com wrote:



  Hello,



   Suppose I want to use Spark from an application that I already submit to
 run in another container (e.g. Tomcat). Is this at all possible? Or do I
 have to split the app into two components, and submit one to Spark and one
 to the other container? In that case, what is the preferred way for the two
 components to communicate with each other? Thanks!



  Best, Oliver



 Oliver Ruebenacker | Solutions Architect



 Altisource™

 290 Congress St, 7th Floor | Boston, Massachusetts 02210

 P: (617) 728-5582 | ext: 275585

 oliver.ruebenac...@altisource.com | www.Altisource.com







RE: Spark as a Library

2014-09-16 Thread Ruebenacker, Oliver A

 Hello,

  Thanks for the response and great to hear it is possible. But how do I 
connect to Spark without using the submit script?

  I know how to start up a master and some workers and then connect to the 
master by packaging the app that contains the SparkContext and submitting 
the package with the spark-submit script in standalone mode. But I don’t want 
to submit the app that contains the SparkContext via the script, because I want 
that app to be running on a web server. So, what are other ways to connect to 
Spark? I can’t find anything in the docs other than using the script. Thanks!

 Best, Oliver

From: Matei Zaharia [mailto:matei.zaha...@gmail.com]
Sent: Tuesday, September 16, 2014 1:31 PM
To: Ruebenacker, Oliver A; user@spark.apache.org
Subject: Re: Spark as a Library

If you want to run the computation on just one machine (using Spark's local 
mode), it can probably run in a container. Otherwise you can create a 
SparkContext there and connect it to a cluster outside. Note that I haven't 
tried this though, so the security policies of the container might be too 
restrictive. In that case you'd have to run the app outside and expose an RPC 
interface between them.

Matei


On September 16, 2014 at 8:17:08 AM, Ruebenacker, Oliver A 
(oliver.ruebenac...@altisource.com) 
wrote:

 Hello,

  Suppose I want to use Spark from an application that I already submit to run 
in another container (e.g. Tomcat). Is this at all possible? Or do I have to 
split the app into two components, and submit one to Spark and one to the other 
container? In that case, what is the preferred way for the two components to 
communicate with each other? Thanks!

 Best, Oliver

Oliver Ruebenacker | Solutions Architect

Altisource™
290 Congress St, 7th Floor | Boston, Massachusetts 02210
P: (617) 728-5582 | ext: 275585
oliver.ruebenac...@altisource.com | 
www.Altisource.com



Re: Spark as a Library

2014-09-16 Thread Daniel Siegmann
You can create a new SparkContext inside your container pointed to your
master. However, for your script to run you must call addJars to put the
code on your workers' classpaths (except when running locally).

Hopefully your webapp has some lib folder which you can point to as a source
for the jars. In the Play Framework you can use
play.api.Play.application.getFile("lib") to get a path to the lib directory
and read its contents. Of course, that only works on the packaged web app.
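
A rough sketch of that wiring (the lib path and master URL are placeholders; in
a packaged Play app the directory would come from
play.api.Play.application.getFile("lib") as described above):

import java.io.File
import org.apache.spark.{SparkConf, SparkContext}

// Collect the packaged webapp's jars so the executors can load the job classes.
val libDir = new File("/opt/mywebapp/lib")
val jars = libDir.listFiles()
  .filter(_.getName.endsWith(".jar"))
  .map(_.getAbsolutePath)
  .toSeq

val conf = new SparkConf()
  .setAppName("webapp-spark")
  .setMaster("spark://master-host:7077")
  .setJars(jars)          // same effect as calling sc.addJar on each file
val sc = new SparkContext(conf)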

On Tue, Sep 16, 2014 at 3:17 PM, Ruebenacker, Oliver A 
oliver.ruebenac...@altisource.com wrote:



  Hello,



   Thanks for the response and great to hear it is possible. But how do I
 connect to Spark without using the submit script?



   I know how to start up a master and some workers and then connect to the
 master by packaging the app that contains the SparkContext and then
 submitting the package with the spark-submit script in standalone-mode. But
 I don’t want to submit the app that contains the SparkContext via the
 script, because I want that app to be running on a web server. So, what are
 other ways to connect to Spark? I can’t find in the docs anything other
 than using the script. Thanks!



  Best, Oliver



 *From:* Matei Zaharia [mailto:matei.zaha...@gmail.com]
 *Sent:* Tuesday, September 16, 2014 1:31 PM
 *To:* Ruebenacker, Oliver A; user@spark.apache.org
 *Subject:* Re: Spark as a Library



 If you want to run the computation on just one machine (using Spark's
 local mode), it can probably run in a container. Otherwise you can create a
 SparkContext there and connect it to a cluster outside. Note that I haven't
 tried this though, so the security policies of the container might be too
 restrictive. In that case you'd have to run the app outside and expose an
 RPC interface between them.



 Matei



 On September 16, 2014 at 8:17:08 AM, Ruebenacker, Oliver A (
 oliver.ruebenac...@altisource.com) wrote:



  Hello,



   Suppose I want to use Spark from an application that I already submit to
 run in another container (e.g. Tomcat). Is this at all possible? Or do I
 have to split the app into two components, and submit one to Spark and one
 to the other container? In that case, what is the preferred way for the two
 components to communicate with each other? Thanks!



  Best, Oliver



 Oliver Ruebenacker | Solutions Architect



 Altisource™

 290 Congress St, 7th Floor | Boston, Massachusetts 02210

 P: (617) 728-5582 | ext: 275585

 oliver.ruebenac...@altisource.com | www.Altisource.com







-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: daniel.siegm...@velos.io W: www.velos.io


RE: Spark as a Library

2014-09-16 Thread Paolo Platter
Hi,

Spark Job Server by Ooyala is the right tool for the job. It exposes a REST API, 
so calling it from a web app is suitable.
It is open source; you can find it on GitHub.

Best

Paolo Platter

From: Ruebenacker, Oliver A (oliver.ruebenac...@altisource.com)
Sent: 16/09/2014 21.18
To: Matei Zaharia (matei.zaha...@gmail.com); user@spark.apache.org
Subject: RE: Spark as a Library


 Hello,

  Thanks for the response and great to hear it is possible. But how do I 
connect to Spark without using the submit script?

  I know how to start up a master and some workers and then connect to the 
master by packaging the app that contains the SparkContext and then submitting 
the package with the spark-submit script in standalone-mode. But I don’t want 
to submit the app that contains the SparkContext via the script, because I want 
that app to be running on a web server. So, what are other ways to connect to 
Spark? I can’t find in the docs anything other than using the script. Thanks!

 Best, Oliver

From: Matei Zaharia [mailto:matei.zaha...@gmail.com]
Sent: Tuesday, September 16, 2014 1:31 PM
To: Ruebenacker, Oliver A; user@spark.apache.org
Subject: Re: Spark as a Library

If you want to run the computation on just one machine (using Spark's local 
mode), it can probably run in a container. Otherwise you can create a 
SparkContext there and connect it to a cluster outside. Note that I haven't 
tried this though, so the security policies of the container might be too 
restrictive. In that case you'd have to run the app outside and expose an RPC 
interface between them.

Matei


On September 16, 2014 at 8:17:08 AM, Ruebenacker, Oliver A 
(oliver.ruebenac...@altisource.com) 
wrote:

 Hello,

  Suppose I want to use Spark from an application that I already submit to run 
in another container (e.g. Tomcat). Is this at all possible? Or do I have to 
split the app into two components, and submit one to Spark and one to the other 
container? In that case, what is the preferred way for the two components to 
communicate with each other? Thanks!

 Best, Oliver

Oliver Ruebenacker | Solutions Architect

Altisource™
290 Congress St, 7th Floor | Boston, Massachusetts 02210
P: (617) 728-5582 | ext: 275585
oliver.ruebenac...@altisource.com | 
www.Altisource.com



Building spark with native library support

2014-03-06 Thread Alan Burlison

Hi,

I've successfully built 0.9.0-incubating on Solaris using sbt, following 
the instructions at http://spark.incubator.apache.org/docs/latest/ and 
it seems to work OK. However, when I start it up I get an error about 
missing Hadoop native libraries. I can't find any mention of how to 
build the native components in the instructions; how is that done?


Thanks,

--
Alan Burlison
--


Re: Building spark with native library support

2014-03-06 Thread Matei Zaharia
Is it an error, or just a warning? In any case, you need to get those libraries 
from a build of Hadoop for your platform. Then add them to the 
SPARK_LIBRARY_PATH environment variable in conf/spark-env.sh, or to your 
-Djava.library.path if launching an application separately.

These libraries just speed up some compression codecs BTW, so it should be fine 
to run without them too.

Matei
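
Concretely, that amounts to something like the following (the path is a
placeholder; point it at wherever your Hadoop build put the native libraries,
typically a lib/native directory under the Hadoop install):

# conf/spark-env.sh -- sketch only
export SPARK_LIBRARY_PATH=/opt/hadoop/lib/native

# or, when launching an application yourself:
#   java -Djava.library.path=/opt/hadoop/lib/native ... com.example.MyApp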

On Mar 6, 2014, at 9:04 AM, Alan Burlison alan.burli...@oracle.com wrote:

 Hi,
 
 I've successfully built 0.9.0-incubating on Solaris using sbt, following the 
 instructions at http://spark.incubator.apache.org/docs/latest/ and it seems 
 to work OK. However, when I start it up I get an error about missing Hadoop 
 native libraries. I can't find any mention of how to build the native 
 components in the instructions, how is that done?
 
 Thanks,
 
 -- 
 Alan Burlison
 --



RE: Building spark with native library support

2014-03-06 Thread Jeyaraj, Arockia R (Arockia)
Hi,

I am trying to set up Spark on Windows for a development environment. I get the 
following error when I run sbt. Please help me resolve this issue. I work for 
Verizon and am on my company network, so I can't access the internet without a 
proxy.

C:\Userssbt
Getting org.fusesource.jansi jansi 1.11 ...
You probably access the destination server through a proxy server that is not we
ll configured.
You probably access the destination server through a proxy server that is not we
ll configured.
You probably access the destination server through a proxy server that is not we
ll configured.

:: problems summary ::
 WARNINGS
Host repo.typesafe.com not found. url=http://repo.typesafe.com/typesafe/
ivy-releases/org.fusesource.jansi/jansi/1.11/ivys/ivy.xml

Host repo1.maven.org not found. url=http://repo1.maven.org/maven2/org/fu
sesource/jansi/jansi/1.11/jansi-1.11.pom

Host repo1.maven.org not found. url=http://repo1.maven.org/maven2/org/fu
sesource/jansi/jansi/1.11/jansi-1.11.jar

module not found: org.fusesource.jansi#jansi;1.11

 local: tried

  C:\Users\v983654\.ivy2\local\org.fusesource.jansi\jansi\1.11\ivys\ivy.
xml

  -- artifact org.fusesource.jansi#jansi;1.11!jansi.jar:

  C:\Users\v983654\.ivy2\local\org.fusesource.jansi\jansi\1.11\jars\jans
i.jar

 typesafe-ivy-releases: tried

  http://repo.typesafe.com/typesafe/ivy-releases/org.fusesource.jansi/ja
nsi/1.11/ivys/ivy.xml

 Maven Central: tried

  http://repo1.maven.org/maven2/org/fusesource/jansi/jansi/1.11/jansi-1.
11.pom

  -- artifact org.fusesource.jansi#jansi;1.11!jansi.jar:

  http://repo1.maven.org/maven2/org/fusesource/jansi/jansi/1.11/jansi-1.
11.jar

::

::  UNRESOLVED DEPENDENCIES ::

::

:: org.fusesource.jansi#jansi;1.11: not found

::



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
unresolved dependency: org.fusesource.jansi#jansi;1.11: not found
Error during sbt execution: Error retrieving required libraries
  (see C:\Users\v983654\.sbt\boot\update.log for complete log)
Error: Could not retrieve jansi 1.11

Thanks
Arockia Raja
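
In case it helps: sbt runs on the JVM, so the proxy normally has to be passed as
standard Java networking properties before launching it (depending on the
launcher, JAVA_OPTS or SBT_OPTS is picked up). A sketch for a Windows shell,
with host, port and credentials as placeholders:

rem Sketch only -- substitute your proxy host, port and credentials.
set JAVA_OPTS=-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080 -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080 -Dhttp.proxyUser=username -Dhttp.proxyPassword=secret
set SBT_OPTS=%JAVA_OPTS%
sbt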

-Original Message-
From: Matei Zaharia [mailto:matei.zaha...@gmail.com] 
Sent: Thursday, March 06, 2014 11:44 AM
To: user@spark.apache.org
Subject: Re: Building spark with native library support

Is it an error, or just a warning? In any case, you need to get those libraries 
from a build of Hadoop for your platform. Then add them to the 
SPARK_LIBRARY_PATH environment variable in conf/spark-env.sh, or to your 
-Djava.library.path if launching an application separately.

These libraries just speed up some compression codecs BTW, so it should be fine 
to run without them too.

Matei

On Mar 6, 2014, at 9:04 AM, Alan Burlison alan.burli...@oracle.com wrote:

 Hi,
 
 I've successfully built 0.9.0-incubating on Solaris using sbt, following the 
 instructions at http://spark.incubator.apache.org/docs/latest/ and it seems 
 to work OK. However, when I start it up I get an error about missing Hadoop 
 native libraries. I can't find any mention of how to build the native 
 components in the instructions, how is that done?
 
 Thanks,
 
 -- 
 Alan Burlison
 --



Re: Building spark with native library support

2014-03-06 Thread Alan Burlison

On 06/03/2014 18:55, Matei Zaharia wrote:


For the native libraries, you can use an existing Hadoop build and
just put them on the path. For linking to Hadoop, Spark grabs it
through Maven, but you can do mvn install locally on your version
of Hadoop to install it to your local Maven cache, and then configure
Spark to use that version. Spark never builds Hadoop itself, it just
downloads it through Maven.


OK, thanks for the pointers.

--
Alan Burlison
--