I once worked on a named row feature but never found time to finish it. It looks like this:

|sql("...").named.map { row: NamedRow =>
  row[Int]('key) -> row[String]('value)
}|

Basically, the |named| method generates a field-name-to-ordinal map for each RDD partition. This map is then shared by all |NamedRow| instances within that partition. Not exactly what you want, but it might be helpful.
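A rough, Spark-free sketch of that idea might look like the following. The names `NamedRowSketch` and `ordinalsOf`, the constructor shape, and the use of a plain iterator to stand in for a partition are all assumptions made for illustration, not the API of the unfinished feature.

```scala
// Hypothetical stand-in for the unfinished NamedRow feature; not Spark's API.
// Every row created for one partition holds a reference to the same ordinal map.
final class NamedRow(values: IndexedSeq[Any], ordinals: Map[Symbol, Int]) {
  // Resolve the field name to its ordinal, then cast to the requested type.
  def apply[T](field: Symbol): T = values(ordinals(field)).asInstanceOf[T]
}

object NamedRowSketch {
  // Build the name-to-ordinal map once from the schema's field names.
  def ordinalsOf(fieldNames: Seq[String]): Map[Symbol, Int] =
    fieldNames.zipWithIndex.map { case (name, i) => Symbol(name) -> i }.toMap

  // Wrap each raw row of a partition (modelled here as an Iterator)
  // with the single shared map instance.
  def named(fieldNames: Seq[String],
            partition: Iterator[IndexedSeq[Any]]): Iterator[NamedRow] = {
    val ordinals = ordinalsOf(fieldNames)
    partition.map(values => new NamedRow(values, ordinals))
  }
}
```

Because the map is built once per call to `named`, the per-row cost is one hash lookup per field access rather than one map per row. An unknown field name fails with a `NoSuchElementException` in this sketch.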

Cheng

On 1/20/15 3:39 AM, Night Wolf wrote:

In Spark SQL we have |Row| objects which contain a list of fields that make up a row. A |Row| has ordinal accessors such as |.getInt(0)| or |.getString(2)|.

Say ordinal 0 = ID and ordinal 1 = Name. It becomes hard to remember which ordinal is which, making the code confusing.

Say, for example, I have the following code:

|def doStuff(row: Row) = {
  // extract some items from the row into a tuple
  (row.getInt(0), row.getString(1)) // tuple of ID, Name
}|

The question becomes: how can I create aliases for these fields in a |Row| object?

I was thinking I could create methods which take an implicit |Row| object:

|def id(implicit row: Row) = row.getInt(0)
def name(implicit row: Row) = row.getString(1)|

I could then rewrite the above as:

|def doStuff(implicit row: Row) = {
  // extract some items from the row into a tuple
  (id, name) // tuple of ID, Name
}|
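For completeness, here is a self-contained sketch of how the implicit version could be wired up and called. The `Row` case class below is only a minimal stand-in so the snippet runs without Spark, and the enclosing `RowAliases` object is a hypothetical name.

```scala
// Minimal stand-in for Spark SQL's Row (assumption: only the two
// accessors used here are modelled).
final case class Row(values: Seq[Any]) {
  def getInt(i: Int): Int = values(i).asInstanceOf[Int]
  def getString(i: Int): String = values(i).asInstanceOf[String]
}

object RowAliases {
  // Field aliases: each one reads its ordinal from whichever Row is in implicit scope.
  def id(implicit row: Row): Int = row.getInt(0)
  def name(implicit row: Row): String = row.getString(1)

  // The implicit parameter makes `row` available to `id` and `name`.
  def doStuff(implicit row: Row): (Int, String) = (id, name)
}
```

A caller would either pass the row explicitly, as in `RowAliases.doStuff(Row(Seq(7, "Ann")))`, or declare an `implicit val row: Row` in scope before calling `RowAliases.doStuff`.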

Is there a better/neater approach?


Cheers,

~NW
