Re: How to specify file

2016-09-23 Thread Mich Talebzadeh
You can do the following with option("delimiter") ..


val df = spark.read
  .option("header", false)
  .option("delimiter", "\t")
  .csv("hdfs://rhes564:9000/tmp/nw_10124772.tsv")
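
For Sea's '\001'-separated input, the same pattern should apply with the separator written as a Unicode escape; a sketch with an illustrative path:

```scala
// "\u0001" is the same '\001' control character; path is hypothetical.
val df = spark.read
  .option("header", false)
  .option("delimiter", "\u0001")
  .csv("/path/to/file")
```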

HTH

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 23 September 2016 at 07:56, Sea <261810...@qq.com> wrote:

> Hi, I want to run SQL directly on files. I see that Spark supports SQL
> like select * from csv.`/path/to/file`, but the file may not be delimited
> by ','; it might be delimited by '\001'. How can I specify the delimiter?
>
> Thank you!
>
>
>


Re: How to specify file

2016-09-23 Thread Sea
Hi, Hemant, Aditya:
I don't want to create a temp table and write code; I just want to run SQL
directly on files: "select * from csv.`/path/to/file`"
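
A SQL-only route that avoids DataFrame code at query time is to register the file as a temporary view with its delimiter in OPTIONS. A sketch assuming Spark 2.0+, with a hypothetical view name and path; whether the escape is honored may vary by version:

```scala
// View name and path are illustrative; "\u0001" is the '\001' control character.
spark.sql(
  "CREATE TEMPORARY VIEW sea_file USING csv " +
  "OPTIONS (path '/path/to/file', delimiter '\u0001')")
spark.sql("SELECT * FROM sea_file").show()
```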





------------------ Original Message ------------------
From: "Hemant Bhanawat" <hemant9...@gmail.com>
Date: 2016-09-23 (Fri) 3:32
To: "Sea" <261810...@qq.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: How to specify file




Check out the README on the following page. This is the csv connector that you
are using. I think you need to specify the delimiter option.

https://github.com/databricks/spark-csv

Hemant Bhanawat

www.snappydata.io 







 
On Fri, Sep 23, 2016 at 12:26 PM, Sea <261810...@qq.com> wrote:
Hi, I want to run SQL directly on files. I see that Spark supports SQL
like select * from csv.`/path/to/file`, but the file may not be delimited by ','.
It might be delimited by '\001'. How can I specify the delimiter?

Thank you!

Re: How to specify file

2016-09-23 Thread Aditya

Hi Sea,

For using Spark SQL you will need to create DataFrame from the file and 
then execute select * on dataframe.

In your case you will need to do something like this

JavaRDD<String> lines = context.textFile("path");
JavaRDD<Row> rowRDD = lines.map(new Function<String, Row>() {
    public Row call(String record) throws Exception {
        // '\001' is the ASCII SOH control character used as the field separator
        String[] fields = record.split("\001");
        return RowFactory.create((Object[]) fields);
    }
});
DataFrame resultDf = hiveContext.createDataFrame(rowRDD, schema);
resultDf.registerTempTable("test");
hiveContext.sql("select * from test");

You will need to create schema for the file first just like how you have 
created for csv file.





On Friday 23 September 2016 12:26 PM, Sea wrote:
Hi, I want to run SQL directly on files. I see that Spark supports
SQL like select * from csv.`/path/to/file`, but the file may not
be delimited by ','. It might be delimited by '\001'. How can I specify
the delimiter?


Thank you!








Re: How to specify file

2016-09-23 Thread Hemant Bhanawat
Check out the README on the following page. This is the csv connector that
you are using. I think you need to specify the delimiter option.

https://github.com/databricks/spark-csv

Hemant Bhanawat 
www.snappydata.io
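
Per the spark-csv README, the delimiter is set through the DataFrame reader options. A sketch for Spark 1.x with the spark-csv package on the classpath; the path is illustrative:

```scala
// spark-csv 1.x style; "\u0001" is the '\001' control character.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "false")
  .option("delimiter", "\u0001")
  .load("/path/to/file")
```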

On Fri, Sep 23, 2016 at 12:26 PM, Sea <261810...@qq.com> wrote:

> Hi, I want to run SQL directly on files. I see that Spark supports SQL
> like select * from csv.`/path/to/file`, but the file may not be delimited
> by ','; it might be delimited by '\001'. How can I specify the delimiter?
>
> Thank you!
>
>
>


How to specify file

2016-09-23 Thread Sea
Hi, I want to run SQL directly on files. I see that Spark supports SQL
like select * from csv.`/path/to/file`, but the file may not be delimited by ','.
It might be delimited by '\001'. How can I specify the delimiter?

Thank you!