Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread pierred
Love it!

There is a reason why SO is so effective and popular.  Search is excellent,
you can quickly find very thoughtful answers about sometimes thorny
problems, and it is easy to contribute, format code, etc.  Perhaps the most
useful feature is that the best answers naturally bubble up to the top, so
these are the ones you see first.

One annoyance is the troll phenomenon; see e.g.
http://michael.richter.name/blogs/why-i-no-longer-contribute-to-stackoverflow
(which also mentions other pet peeves about SO).  That phenomenon is, IMHO,
most prevalent on Stack Overflow itself, and perhaps less so on other
Stack Exchange sites.

At the same time, I do appreciate the pressure to provide well-written,
concise questions and answers that are meant for posterity.  That peer
pressure is, to a good extent, what makes the material on SO so valuable and
useful.  It is probably a tricky balance to strike.

A dedicated Stack Exchange site for Apache Spark sounds to me like the
logical solution.  Less trolling, more enthusiasm, and with the
participation of the people on this list, I think it would very quickly
become the reference for many technical questions, as well as a great
vehicle to promote the awesomeness of Spark.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21321.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: wholeTextFiles not working with HDFS

2014-08-22 Thread pierred
I had the same issue with spark-1.0.2-bin-hadoop*1*, and indeed the issue
seems to be related to Hadoop 1.  After switching to
spark-1.0.2-bin-hadoop*2*, the issue disappears.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/wholeTextFiles-not-working-with-HDFS-tp7490p12677.html



Development environment issues

2014-08-21 Thread pierred
Hello all,

I am trying to get productive with Spark and Scala but I haven't figured out
a good development environment yet.

Coming from the eclipse/java/ant/ivy/hadoop world, I understand that I have
a steep learning curve ahead of me, but I would still expect to be able to
settle relatively quickly into an environment that easily lets me do the
following tasks:

1. debug locally  (being able to set breakpoints, inspect variables,
single-step if necessary, etc.)

2. run individual tests and test suites, manually as well as through a CI
server, e.g. Jenkins

3. deploy to an EC2 cluster and monitor the jobs

So far it has been quite frustrating. I can get things going at the command
line via sbt, but I'd like to run the code-edit-test-debug cycle neatly and
seamlessly inside an IDE.

I recently updated my JDK to 1.8.0_11 (breaking a bunch of things in the
process). Ideally I'd like to use Scala 2.11.2 and its new features
(although at the moment several libraries that I use aren't yet
available for Scala 2.11).  As a longtime Eclipse user, I am so far kind
of disappointed with its Scala/sbt support.  For the Eclipse version I use
(Luna), I found a milestone version of Scala IDE that seems to more or less
work, but I can't run tests with it (it crashes).

Without getting too deep into the details, the whole stack (the way I set it
up, anyway) seems a bit wobbly, to say the least.

So, what is the accepted wisdom in terms of IDE and development environment?
Is there a good tutorial to set things up so that one half of the
libraries/tools doesn't break the other half?

What do you guys use?
scala 2.10 or 2.11?
sbt or maven?
eclipse or idea?
jdk7 or 8?




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Development-environment-issues-tp12611.html