Apache Spark : spark.eventLog.dir on Windows Environment
Hi All,

I am working with Spark 1.4 in a Windows environment. I have to set the event log directory so that I can reopen the Spark UI after the application has finished, but I am not able to set eventLog.dir; it gives an error on Windows.

Configuration is :

Exception I get:

    java.io.IOException: Cannot run program "cygpath": CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:206)

I have also tried installing Cygwin, but the error still doesn't go away. Can anybody give any advice on it?

I have posted the same question on Stack Overflow as well: http://stackoverflow.com/questions/31468716/apache-spark-spark-eventlog-dir-on-windows-environment

Thanks
Nitin

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-spark-eventLog-dir-on-Windows-Environment-tp23913.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
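For anyone reading later: event logging is normally switched on with two properties. A minimal sketch of the relevant spark-defaults.conf entries, assuming a local Windows directory (the file:/// path below is illustrative, not the poster's actual value):

```
spark.eventLog.enabled  true
spark.eventLog.dir      file:///C:/tmp/spark-events
```

With a file:/// URI the path is handed to Hadoop's file system layer, which on Windows appears to be where the cygpath lookup in org.apache.hadoop.util.Shell is triggered when Hadoop's native/winutils support is not set up.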
Re: JdbcRDD and ClassTag issue
Thanks Sujee :)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/JdbcRDD-and-ClassTag-issue-tp18570p23912.html
Debugging Apache Spark clustered application from Eclipse
I am trying to debug a Spark application running in a clustered/distributed environment from Eclipse, but I have not been able to succeed. The application is Java-based and I run it through Eclipse; the master/worker configuration is supplied to Spark through Java only.

I can debug the code on the driver side, but as soon as the code flow moves into Spark (i.e. a call to .map(..)), the debugger no longer stops, because that code runs in the workers' JVMs. Is there any way I can achieve this?

I have tried giving the following configuration to Tomcat through Eclipse:

    -Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=7761,suspend=n

and setting the corresponding port under Debug -> Remote Java Application. But after these settings I get the error:

    Failed to connect to remote VM. Connection refused

Note: I have tried this on Windows as well as on Linux (CentOS). If anybody has a solution to this, please help.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Debugging-Apache-Spark-clustered-application-from-Eclipse-tp23483.html
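One approach I have seen suggested (a sketch, assuming Spark standalone and an arbitrary port): instead of passing the JDWP flags to Tomcat, hand them to the executor JVMs themselves via spark.executor.extraJavaOptions, then attach Eclipse's Remote Java Application debugger to the worker host on that port:

```
spark.executor.extraJavaOptions  -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=7761
```

Note this only works cleanly with one executor per host (otherwise the fixed port collides), and suspend=y would be needed to catch code that runs before the debugger attaches.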
Spark and insertion into RDBMS/NoSQL
Hi All,

We are exploring insertion into an RDBMS (SQL Server) from Spark via a JDBC driver. The excerpt from the code is as follows; we are doing the insertion inside an action:

    Integer res = flatMappedRDD.reduce(new Function2<Integer, Integer, Integer>() {
        @Override
        public Integer call(Integer arg0, Integer arg1) throws Exception {
            String url = "jdbc:jtds:sqlserver://10.189.220.12:1433/exampleDB";
            String user = "clienta";
            String pass = "clienta";
            Connection con = null;
            try {
                Class.forName("net.sourceforge.jtds.jdbc.Driver");
                con = DriverManager.getConnection(url, user, pass);
                Statement s = con.createStatement();
                System.out.println("Inserting into DB : - " + (arg0 + arg1));
                String sql = "INSERT INTO Table_1 VALUES (" + (arg0 + arg1) + ")";
                s.executeUpdate(sql);
            } catch (ClassNotFoundException | SQLException e) {
                e.printStackTrace();
            } finally {
                if (con != null) con.close();
            }
            return arg0 + arg1;
        }
    });

1) Is this the right way to do this?
2) Does every node create a new connection object? If yes, is there an overhead from creating multiple connection objects, and how can we avoid it?
3) What is the best practice for creating a connection object for insertions, or general updates, into a database from Spark?

Thanks
Nitin

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-and-insertion-into-RDBMS-NoSQL-tp18717.html
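On question 2): the usual advice is to do inserts in foreachPartition so that one connection is opened per partition and reused for every record in it, rather than per record (or inside a reduce). Below is a small self-contained Java sketch of that pattern; the stub connection class stands in for a real JDBC connection, and the nested lists stand in for RDD partitions — none of these names come from Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class PartitionInsertDemo {
    // Counts how many "connections" are opened, to show the amortization.
    static final AtomicInteger OPENED = new AtomicInteger();

    // Stub standing in for an expensive JDBC connection.
    static class StubConnection implements AutoCloseable {
        StubConnection() { OPENED.incrementAndGet(); }
        void insert(int value) { /* a real version would run an INSERT here */ }
        @Override public void close() { }
    }

    // One connection per partition: open once, reuse for every record in it.
    static void insertPartition(List<Integer> partition) {
        try (StubConnection con = new StubConnection()) {
            for (int v : partition) {
                con.insert(v);
            }
        }
    }

    public static void main(String[] args) {
        // Two "partitions", as foreachPartition would hand them to a worker.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5, 6));
        partitions.forEach(PartitionInsertDemo::insertPartition);
        // 2 partitions -> 2 connections, not 6 (one per record).
        System.out.println("connections opened: " + OPENED.get());
    }
}
```

In real Spark code the same shape is rdd.foreachPartition(iterator -> { open connection; loop over the iterator; close the connection; }), which also avoids trying to serialize the (non-serializable) connection to the workers.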
JdbcRDD and ClassTag issue
Hi All,

I am trying to access SQL Server through JdbcRDD, but I am getting an error at the ClassTag placeholder. Here is the code I wrote:

    public void readFromDB() {
        String sql = "Select * from Table_1 where values >= ? and values <= ?";

        class GetJDBCResult extends AbstractFunction1<ResultSet, Integer> {
            public Integer apply(ResultSet rs) {
                Integer result = null;
                try {
                    result = rs.getInt(1);
                } catch (SQLException e) {
                    e.printStackTrace();
                }
                return result;
            }
        }

        JdbcRDD<Integer> jdbcRdd = new JdbcRDD<Integer>(sc, jdbcInitialization(), sql, 0, 120, 2,
                new GetJDBCResult(),
                scala.reflect.ClassTag$.MODULE$.apply(Object.class));
    }

Can anybody here recommend a solution to this?

Thanks
Nitin

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/JdbcRDD-and-ClassTag-issue-tp18570.html
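For anyone who hits this later: Spark ships a Java-friendly factory, JdbcRDD.create (since 1.1), which supplies the ClassTag internally so the placeholder never has to be built by hand. A sketch under that assumption — connection details copied from the code above; it needs Spark and the jTDS driver on the classpath, so it is illustrative rather than tested:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.rdd.JdbcRDD;

public final class JdbcRddExample {
    // jsc is a JavaSparkContext (JdbcRDD.create takes the Java wrapper, not SparkContext).
    public static JavaRDD<Integer> readFromDB(JavaSparkContext jsc) {
        String sql = "Select * from Table_1 where values >= ? and values <= ?";
        return JdbcRDD.create(
                jsc,
                new JdbcRDD.ConnectionFactory() {
                    @Override
                    public Connection getConnection() throws Exception {
                        // Connection details illustrative, taken from the earlier post.
                        return DriverManager.getConnection(
                                "jdbc:jtds:sqlserver://10.189.220.12:1433/exampleDB",
                                "clienta", "clienta");
                    }
                },
                sql,
                0, 120, 2,  // lowerBound, upperBound, numPartitions
                new Function<ResultSet, Integer>() {
                    @Override
                    public Integer call(ResultSet rs) throws Exception {
                        return rs.getInt(1);
                    }
                });
    }
}
```

The two `?` placeholders in the query are filled by Spark from the lower/upper bounds to split the scan across the partitions.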