Hi Nipun,

You're right. I created a pull request fixing the documentation:
https://github.com/apache/spark/pull/5569, and filed the corresponding issue:
https://issues.apache.org/jira/browse/SPARK-6992
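For reference, here is a minimal sketch of the corrected steps, assuming the
Spark 1.3.0 Java API and reusing the people, schemaString, and results
variables from the guide's example (untested here; the authoritative fix is
in the pull request above). The schema factory methods live on
org.apache.spark.sql.types.DataTypes, and rows are built with
org.apache.spark.sql.RowFactory rather than the non-existent Row.create:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructField;
    import org.apache.spark.sql.types.StructType;

    // Generate the schema: in 1.3 the createStructField/createStructType
    // factory methods are on DataTypes, not on DataType.
    List<StructField> fields = new ArrayList<StructField>();
    for (String fieldName : schemaString.split(" ")) {
      fields.add(DataTypes.createStructField(fieldName, DataTypes.StringType, true));
    }
    StructType schema = DataTypes.createStructType(fields);

    // Convert records of the RDD (people) to Rows: RowFactory.create
    // replaces Row.create, which does not exist in the Java API.
    JavaRDD<Row> rowRDD = people.map(
      new Function<String, Row>() {
        public Row call(String record) throws Exception {
          String[] parts = record.split(",");
          return RowFactory.create(parts[0], parts[1].trim());
        }
      });

    // Collecting the results: DataFrame.map expects a Scala function in the
    // 1.3 API, so convert to a JavaRDD first via javaRDD().
    List<String> names = results.javaRDD().map(
      new Function<Row, String>() {
        public String call(Row row) {
          return "Name: " + row.getString(0);
        }
      }).collect();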
Thank you for your time,
Olivier.

On Sat, Apr 18, 2015 at 1:11 AM, Nipun Batra <batrani...@gmail.com> wrote:

> Hi Olivier,
>
> Thank you for responding.
>
> I am able to find org.apache.spark.sql.Row in spark-catalyst_2.10-1.3.0,
> but it was not visible in the API document yesterday
> (https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/package-frame.html).
> I am pretty sure.
>
> I also think this document needs to be changed:
> https://spark.apache.org/docs/latest/sql-programming-guide.html
>
>     return Row.create(fields[0], fields[1].trim());
>
> needs to be replaced with RowFactory.create.
>
> Thanks again for your response.
>
> Thanks,
> Nipun Batra
>
>
> On Fri, Apr 17, 2015 at 2:50 PM, Olivier Girardot <ssab...@gmail.com> wrote:
>
>> Hi Nipun,
>> I'm sorry, but I don't understand exactly what your problem is.
>> Regarding org.apache.spark.sql.Row: it does exist, in the Spark SQL
>> dependency.
>> Is it a compilation problem?
>> Are you trying to run a main method using the pom you've just described,
>> or are you trying to spark-submit the jar?
>> If you're trying to run a main method, the "provided" scope is not
>> designed for that and will make your program fail.
>>
>> Regards,
>>
>> Olivier.
>>
>> On Fri, Apr 17, 2015 at 9:52 PM, Nipun Batra <bni...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> In the example given in the SQL document
>>> https://spark.apache.org/docs/latest/sql-programming-guide.html,
>>> org.apache.spark.sql.Row does not exist in the Java API, or at least
>>> I was not able to find it.
>>>
>>> Build info: downloaded from the Spark website.
>>>
>>> Dependency:
>>>
>>>     <dependency>
>>>         <groupId>org.apache.spark</groupId>
>>>         <artifactId>spark-sql_2.10</artifactId>
>>>         <version>1.3.0</version>
>>>         <scope>provided</scope>
>>>     </dependency>
>>>
>>> Code in the documentation:
>>>
>>>     // Import factory methods provided by DataType.
>>>     import org.apache.spark.sql.types.DataType;
>>>     // Import StructType and StructField
>>>     import org.apache.spark.sql.types.StructType;
>>>     import org.apache.spark.sql.types.StructField;
>>>     // Import Row.
>>>     import org.apache.spark.sql.Row;
>>>
>>>     // sc is an existing JavaSparkContext.
>>>     SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
>>>
>>>     // Load a text file and convert each line to a JavaBean.
>>>     JavaRDD<String> people = sc.textFile("examples/src/main/resources/people.txt");
>>>
>>>     // The schema is encoded in a string
>>>     String schemaString = "name age";
>>>
>>>     // Generate the schema based on the string of schema
>>>     List<StructField> fields = new ArrayList<StructField>();
>>>     for (String fieldName : schemaString.split(" ")) {
>>>       fields.add(DataType.createStructField(fieldName, DataType.StringType, true));
>>>     }
>>>     StructType schema = DataType.createStructType(fields);
>>>
>>>     // Convert records of the RDD (people) to Rows.
>>>     JavaRDD<Row> rowRDD = people.map(
>>>       new Function<String, Row>() {
>>>         public Row call(String record) throws Exception {
>>>           String[] fields = record.split(",");
>>>           return Row.create(fields[0], fields[1].trim());
>>>         }
>>>       });
>>>
>>>     // Apply the schema to the RDD.
>>>     DataFrame peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema);
>>>
>>>     // Register the DataFrame as a table.
>>>     peopleDataFrame.registerTempTable("people");
>>>
>>>     // SQL can be run over RDDs that have been registered as tables.
>>>     DataFrame results = sqlContext.sql("SELECT name FROM people");
>>>
>>>     // The results of SQL queries are DataFrames and support all the
>>>     // normal RDD operations.
>>>     // The columns of a row in the result can be accessed by ordinal.
>>>     List<String> names = results.map(new Function<Row, String>() {
>>>       public String call(Row row) {
>>>         return "Name: " + row.getString(0);
>>>       }
>>>     }).collect();
>>>
>>> Thanks,
>>> Nipun