Thanks for the message.

There is an open issue about the public type / schema system that is
related to this topic: https://issues.apache.org/jira/browse/SPARK-2179

You probably want to comment on that ticket as well.



On Sat, Jun 21, 2014 at 7:52 AM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:

> Hi ,
>
> The current implementation of JavaSchemaRDD need a special JavaBean class
> to define schema information for tables, but when developing applications
> using the Spark SQL API, table is a more dynamic component, the awkward
> thing is when new tables are defined, we must create a new JavaBean, and
> redeploy the whole application. So here comes an idea regarding a more
> general schema registration method,
>
>
>
> Step1: Defile a new Java class named RowSchema in API to define column
> information, column name and data type are most important ones.
>
>
>
> Step2: the actual data is store just as JavaRDD<Row>;
>
>
>
> Step3:When loading data into JavaRDD<Row>, the API provides a general map
> function, which takes a RowSchema object as parameter, to map each line to
> a Row object.
>
>
>
> Step4: Add a new applySchema method , which takes a RowSchema object as
> parameter, to the JavaSQLContext class ,
>
>
>
> Step 5: The registerAsTable and all other SQL releated methods of
> JavaSQLContext class should take care of the difference of defining schema
> throw JavaBean and RowSchemas.(That’s the work of the API layer)
>
>
>
> The API is something like this:
>
>
>
> Public Class RowSchema{
>
> Public RowSchema(List<String> colNames, List<String> colDataTypes);
>
> Public String getColName(integer i);//return column string of column I;
>
> Public integer getColDataType(integer i);//return data type of column I;
>
> Public integer getColNumber();// return number of columns
>
>
>
> };
>
> RowSchema rs = new RowSchema(……);
>
>
>
> JavaRDD<Row> table = ctx.textFile(“file path”).map(rs);
>
>
>
> JavaSchemaRDD schemaPeople = sqlCtx.applySchema(table, rs);
>
> schemaPeople.registerAsTable("people");
>
> Regards,
>
>
>  Xiaobo Gu

Reply via email to