Re: Discourse: A proposed alternative to the Spark User list
Love it! There is a reason why SO is so effective and popular: search is excellent, you can quickly find very thoughtful answers to sometimes thorny problems, and it is easy to contribute, format code, etc. Perhaps the most useful feature is that the best answers naturally bubble up to the top, so they are the ones you see first.

One annoyance is the troll phenomenon; see e.g. http://michael.richter.name/blogs/why-i-no-longer-contribute-to-stackoverflow (which also mentions other pet peeves about SO). That phenomenon is, IMHO, most prevalent on Stack Overflow itself, and perhaps less so on the other Stack Exchange sites. At the same time, I do appreciate the pressure to provide well-written, concise questions and answers that hold up for posterity. That peer pressure is what, to a good extent, makes the material on SO so valuable and useful. It is probably a tricky balance to strike.

A dedicated Stack Exchange site for Apache Spark sounds to me like the logical solution: less trolling, more enthusiasm, and with the participation of the people on this list, I think it would very quickly become the reference for many technical questions, as well as a great vehicle to promote the awesomeness of Spark.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21321.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: wholeTextFiles not working with HDFS
I had the same issue with spark-1.0.2-bin-hadoop*1*, and indeed the problem seems related to Hadoop 1: when I switch to spark-1.0.2-bin-hadoop*2*, the issue disappears.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/wholeTextFiles-not-working-with-HDFS-tp7490p12677.html
Development environment issues
Hello all,

I am trying to get productive with Spark and Scala, but I haven't figured out a good development environment yet. Coming from the eclipse/java/ant/ivy/hadoop world, I understand that I have a steep learning curve ahead of me, but I would still expect to be able to settle relatively quickly into an environment that easily lets me do the following:

1. debug locally (set breakpoints, inspect variables, single-step if necessary, etc.)
2. run individual tests and test suites, both manually and through a CI server such as Jenkins
3. deploy to an EC2 cluster and monitor the jobs

So far it has been quite frustrating. I can get things going at the command line via sbt, but I'd like to do the code-edit-test-debug cycle neatly and seamlessly in an IDE. I recently updated my JDK to 1.8.0_11 (breaking a bunch of things in the process). Ideally I'd like to use Scala 2.11.2 and its new features (although at the moment several libraries that I use aren't yet available for Scala 2.11).

As a longtime Eclipse user, I am so far rather disappointed with its Scala/sbt support. For the Eclipse version I use (Luna), I found a milestone version of Scala IDE that seems to more or less work, but I don't seem to be able to run tests (it crashes). Without getting too deep into the details, the whole stack (the way I set it up, anyway) seems a bit wobbly, to say the least.

So, what is the accepted wisdom in terms of IDE and development environment? Is there a good tutorial for setting things up so that one half of the libraries/tools doesn't break the other half? What do you use: Scala 2.10 or 2.11? sbt or maven? Eclipse or IDEA? JDK 7 or 8?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Development-environment-issues-tp12611.html
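For what it's worth, a minimal sbt build for a project against the Spark version discussed in this thread might look like the sketch below. This is only an illustration, not a recommended setup: the project name is hypothetical, and note that Spark 1.x release artifacts were built against Scala 2.10, which is one reason Scala 2.11 is not yet practical here.

```scala
// build.sbt -- minimal sketch, assuming sbt 0.13.x and Spark 1.0.2.
// "my-spark-app" is a hypothetical project name.
name := "my-spark-app"

version := "0.1-SNAPSHOT"

// Spark 1.x artifacts on Maven Central target Scala 2.10, not 2.11.
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // "provided" keeps Spark's jars out of an assembly jar, since the
  // cluster supplies them at runtime.
  "org.apache.spark" %% "spark-core" % "1.0.2" % "provided",
  "org.scalatest"    %% "scalatest"  % "2.2.1" % "test"
)
```

With this in place, `sbt test` runs the ScalaTest suites from the command line, and both Scala IDE and IDEA can import the project from the sbt definition rather than from hand-maintained IDE metadata.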