Yes, it is space-separated data, just as shown below.

And what are your thoughts on the second issue?
Thank you.




At 2015-08-10 15:20:39, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:

Isn't it space-separated data? It is neither comma (,) separated nor pipe (|)
separated.


Thanks
Best Regards


On Mon, Aug 10, 2015 at 12:06 PM, Netwaver <wanglong_...@163.com> wrote:

Hi Spark experts,
                 I am now using Spark 1.4.1 and trying the Spark SQL/DataFrame API 
with a text file in the format below:
                        id gender height
                        1  M  180
                        2  F   167
                        ... ...
                 But I ran into the issues described below:
                 1.  In my test program, I specify the schema programmatically, 
but when I use "|" as the separator in the schema string, the code runs into the 
exception below when executed on the cluster (Standalone):
                  
                   When I use "," as the separator, everything works fine.
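
One possible cause, offered only as a guess because the exception text is not included above: if the schema string is turned into field names with String.split("|"), the argument is treated as a regular expression. An unescaped | is an alternation of two empty patterns, so the string is cut between every character instead of at the pipes, while "," has no special meaning in a regex and splits as expected. A minimal sketch in plain Scala (no Spark needed); the schema string "id|gender|height" and the object name SplitDemo are assumptions for illustration, not taken from the original program:

```scala
import java.util.regex.Pattern

object SplitDemo {
  // Hypothetical schema string; the original message does not show the real one.
  val schemaString = "id|gender|height"

  // split(String) takes a regex: "|" matches the empty string,
  // so the input is cut between characters, not at the pipes.
  def regexSplit(s: String): Array[String] = s.split("|")

  // Escaping the pipe makes the regex match a literal '|'.
  def escapedSplit(s: String): Array[String] = s.split("\\|")

  // Pattern.quote builds the escaped regex for any literal separator.
  def quotedSplit(s: String, sep: String): Array[String] =
    s.split(Pattern.quote(sep))

  // The Char overload (from Scala's StringOps) splits on the literal character.
  def charSplit(s: String): Array[String] = s.split('|')

  def main(args: Array[String]): Unit = {
    println(regexSplit(schemaString).mkString("[", ", ", "]"))
    println(escapedSplit(schemaString).mkString("[", ", ", "]"))
    println(quotedSplit(schemaString, "|").mkString("[", ", ", "]"))
    println(charSplit(schemaString).mkString("[", ", ", "]"))
  }
}
```

If the field list ends up with one entry per character, the schema no longer matches the three values in each row, which could plausibly surface as a runtime exception on the cluster.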
                  2.  In the code, I use the DataFrame.agg() function with the 
same column name for different statistics functions (max, min, avg):
                      val peopleDF = sqlCtx.createDataFrame(rowRDD, schema)
                      peopleDF.filter(peopleDF("gender").equalTo("M"))
                        .agg(Map("height" -> "avg", "height" -> "max", "height" -> "min"))
                        .show()
                    I find that only the last function's result is shown (as 
below). Does this work as designed in Spark?
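
A note on this second point: the behaviour looks like a property of Scala's Map rather than of Spark. A Map holds one value per key, so a literal that repeats "height" three times collapses to the last pair before agg() ever receives it. A minimal sketch in plain Scala (no Spark needed); the object name AggMapDemo is illustrative only:

```scala
object AggMapDemo {
  // Same literal as in the agg() call: three pairs share the key "height".
  val requested: Map[String, String] =
    Map("height" -> "avg", "height" -> "max", "height" -> "min")

  // A Seq of pairs keeps duplicates, unlike a Map.
  val asPairs: Seq[(String, String)] =
    Seq("height" -> "avg", "height" -> "max", "height" -> "min")

  def main(args: Array[String]): Unit = {
    println(requested)     // a single entry survives, the last one written
    println(asPairs.size)  // all three pairs are kept
  }
}
```

If all three statistics are needed, one option is the Column-based overload of agg(), for example peopleDF.agg(avg("height"), max("height"), min("height")) with avg, max and min imported from org.apache.spark.sql.functions. This has not been verified here against 1.4.1, so treat it as a suggestion.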
                                 
                 Hopefully I have described the issues clearly. Please feel 
free to correct me if I have done something wrong. Thanks a lot.







