Re: How to specify default value for StructField?
You can try the code below:

val df = spark.read.format("orc").load("/user/hos/orc_files_test_together")
df.select("f1", "f2").show
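For reference: StructField itself carries no default-value mechanism, so a common workaround is to fill nulls after loading. A minimal sketch, assuming a running SparkSession `spark`; the column names and fill values are illustrative, not from the original thread:

```scala
// Assumes a SparkSession `spark` is in scope (e.g. in spark-shell).
// Column names "f1"/"f2" and the fill values are illustrative.
val df = spark.read.format("orc").load("/user/hos/orc_files_test_together")

// StructField has no notion of a default, so supply per-column
// replacements for nulls after the data is read:
val withDefaults = df.na.fill(Map("f1" -> "N/A", "f2" -> 0))
withDefaults.select("f1", "f2").show
```

This is a post-read workaround rather than a schema-level default; `na.fill` only replaces nulls, it does not affect rows where a value is present.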
Re: why does spark web UI keeps changing its port?
To configure the Spark master web UI port, you can set the environment variable SPARK_MASTER_WEBUI_PORT. You can run `netstat -nao | grep 4040` to check whether port 4040 is in use. Note that 4040 is the default port for the per-application UI; the standalone master web UI defaults to 8080, which matches the log line below.
———
I am not sure why the Spark web UI keeps changing its port every time I restart a cluster. How can I make it always run on one port? I made sure there is no process running on 4040 (the Spark default web UI port), but it still starts at 8080. Any ideas?
MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://x.x.x.x:8080
Thanks!
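A sketch of pinning the master UI port in conf/spark-env.sh; the port value 8081 here is just an example:

```shell
# conf/spark-env.sh -- pin the standalone master web UI to a fixed port
export SPARK_MASTER_WEBUI_PORT=8081

# Before starting the master, check whether the port is already taken:
netstat -nao | grep 8081
```

If the configured port is occupied, the standalone master will try successive ports, which is one reason the UI can appear to "move" between restarts.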
Re: Writing Spark SQL output in Local and HDFS path
The csv writer method has been available since Spark 2.0.0. If you are on an earlier version, you can try the code below (before 2.0 this relies on the external spark-csv package):

result.write.format("csv").save(path)

--
Hi,
I tried the code below:

result.write.csv("home/Prasad/")

It is not working; it says: Error: value csv is not a member of org.apache.spark.sql.DataFrameWriter.
Regards,
Prasad

On Thu, Jan 19, 2017 at 4:35 PM, smartzjp <zjp_j...@163.com> wrote:
Because the reducer count will not be one, the output on HDFS will be a folder, so you can use result.write.csv(folderPath).
--
Hi,
Can anyone please let us know how to write the output of Spark SQL to a local path and an HDFS path using Scala code?

Code:
scala> val result = sqlContext.sql("select empno, name from emp")
scala> result.show()

result.show() prints the output to the console, but I need to redirect the output to a local file as well as an HDFS file, with "|" as the delimiter. We tried the code below:

result.saveAsTextFile("home/Prasad/result.txt")

It is not working as expected.
--
Regards,
RAVI PRASAD. T
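Since the question also asks for a "|" delimiter and both local and HDFS output, here is a sketch using the Spark 2.x csv writer; the output paths are illustrative, and `csv()` always produces a directory of part files rather than a single file:

```scala
// Assumes a DataFrame `result` is in scope (e.g. from spark.sql(...)).
// Paths below are illustrative.
result.write
  .option("sep", "|")                    // use "|" instead of the default ","
  .mode("overwrite")
  .csv("hdfs:///user/prasad/result")     // HDFS output directory

result.write
  .option("sep", "|")
  .mode("overwrite")
  .csv("file:///home/prasad/result")     // local filesystem output directory
```

To end up with a single local file, one common approach is `result.coalesce(1)` before writing, at the cost of funneling the data through one task.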
Re: Writing Spark SQL output in Local and HDFS path
Because the reducer count will not be one, the output on HDFS will be a folder, so you can use result.write.csv(folderPath).
--
Hi,
Can anyone please let us know how to write the output of Spark SQL to a local path and an HDFS path using Scala code?

Code:
scala> val result = sqlContext.sql("select empno, name from emp")
scala> result.show()

result.show() prints the output to the console, but I need to redirect the output to a local file as well as an HDFS file, with "|" as the delimiter. We tried the code below:

result.saveAsTextFile("home/Prasad/result.txt")

It is not working as expected.
--
Regards,
Prasad. T
Re: how the sparksession initialization, set currentDatabase value?
I think if you want to run Spark SQL on the CLI this configuration will be fine, but if you want to run it as a distributed query engine, start the JDBC/ODBC server and set the Hive address info. You can refer to this description for more detail:
http://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine
-
When Spark reads a Hive table, catalog.currentDatabase is "default". How can I set the currentDatabase value when the SparkSession is initialized?

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://localhost:9083</value>
  <description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
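As far as I know there is no init-time setting for the current database; a common workaround is to switch it immediately after the session is created. A sketch, assuming Spark 2.x with Hive support on the classpath; the database name is illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Build a Hive-enabled session; hive-site.xml (with hive.metastore.uris)
// must be on the classpath, e.g. in $SPARK_HOME/conf.
val spark = SparkSession.builder()
  .appName("example")
  .enableHiveSupport()
  .getOrCreate()

// Switch the session's current database right after startup.
// "my_db" is illustrative.
spark.catalog.setCurrentDatabase("my_db")
```

Alternatively, prefixing table names with the database (`spark.sql("select * from my_db.emp")`) avoids depending on the session's current database at all.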
Re: Spark 2.0.2, KyroSerializer, double[] is not registered.
You can try the following code:

ObjectArraySerializer serializer = new ObjectArraySerializer(kryo, Double[].class);
kryo.register(Double[].class, serializer);

---
Hi, all.
I enabled Kryo in Spark with spark-defaults.conf:

spark.serializer org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired true

A KryoException is raised when a logistic regression algorithm is running:

Note: To register this class use: kryo.register(double[].class);
Serialization trace:
currL1 (org.apache.spark.mllib.stat.MultivariateOnlineSummarizer)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:585)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:36)
at com.twitter.chill.Tuple2Serializer.write(TupleSerializers.scala:33)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)

My question is: isn't double[].class supported by default?
Thanks.
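Note that the exception names the primitive array type double[] (classOf[Array[Double]] in Scala), not the boxed Double[]. With spark.kryo.registrationRequired=true, one way to register it is a custom KryoRegistrator; a sketch, where the registrator class name `MyRegistrator` is illustrative:

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Register the classes Kryo complains about; the list here covers
// exactly the two types named in the serialization trace above.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Scala's Array[Double] is the JVM primitive array double[]:
    kryo.register(classOf[Array[Double]])
    kryo.register(
      Class.forName("org.apache.spark.mllib.stat.MultivariateOnlineSummarizer"))
  }
}
```

Point Spark at it in spark-defaults.conf with `spark.kryo.registrator MyRegistrator` (use the fully qualified class name if it lives in a package).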