Re: Issue with pyspark 1.3.0, sql package and rows

2015-04-08 Thread Davies Liu
I will look into this today.

On Wed, Apr 8, 2015 at 7:35 AM, Stefano Parmesan parme...@spaziodati.eu wrote:
 Has anybody by any chance had a look at this bug? It keeps happening to
 me and it's quite blocking. I would like to understand whether there's something
 wrong in what I'm doing, and whether there's a workaround.

 Thank you all,

 --
 Dott. Stefano Parmesan
 Backend Web Developer and Data Lover ~ SpazioDati s.r.l.
 Via Adriano Olivetti, 13 – 4th floor
 Le Albere district – 38122 Trento – Italy




 --
 View this message in context: 
 http://apache-spark-user-list.1001560.n3.nabble.com/Issue-with-pyspark-1-3-0-sql-package-and-rows-tp22405p22423.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org





Re: Issue with pyspark 1.3.0, sql package and rows

2015-04-08 Thread Stefano Parmesan
Has anybody by any chance had a look at this bug? It keeps happening to
me and it's quite blocking. I would like to understand whether there's something
wrong in what I'm doing, and whether there's a workaround.

Thank you all,

-- 
Dott. Stefano Parmesan
Backend Web Developer and Data Lover ~ SpazioDati s.r.l.
Via Adriano Olivetti, 13 – 4th floor
Le Albere district – 38122 Trento – Italy







Issue with pyspark 1.3.0, sql package and rows

2015-04-07 Thread Stefano Parmesan
Hi all,

I opened a bug on Jira some days ago [1], but I'm starting to think
that is not the right way to go, since I haven't had any news
about it yet.

Let me try to explain it briefly: with pyspark, cogrouping two input
files with different schemas leads (nondeterministically) to wrong
behaviour: the object coming from the first input will have the fields of
the second one (or vice versa). The important fact is that the data in the
row is actually correct; what's wrong is the content of __FIELDS__ on
the rows.
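
To make the expected behaviour concrete, here is a minimal sketch in plain
Python, not pyspark itself. The namedtuples Left and Right are hypothetical
stand-ins for rows of the two schemas, and cogroup and mismatched_fields are
illustrative helpers written for this example, not part of any Spark API. It
only shows the invariant that seems to be violated: after a cogroup, values
from the first input should still carry the first schema's field names.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-ins for rows of the two input schemas (assumption:
# the real job uses pyspark.sql Row objects, not namedtuples).
Left = namedtuple("Left", ["id", "name"])
Right = namedtuple("Right", ["id", "price"])


def cogroup(left, right):
    """Group two lists of (key, value) pairs by key, similar in spirit to RDD.cogroup."""
    grouped = defaultdict(lambda: ([], []))
    for key, value in left:
        grouped[key][0].append(value)
    for key, value in right:
        grouped[key][1].append(value)
    return dict(grouped)


def mismatched_fields(grouped, left_fields, right_fields):
    """Return the sorted keys whose grouped values carry unexpected field names."""
    bad = []
    for key, (lefts, rights) in grouped.items():
        wrong_left = any(v._fields != left_fields for v in lefts)
        wrong_right = any(v._fields != right_fields for v in rights)
        if wrong_left or wrong_right:
            bad.append(key)
    return sorted(bad)


left = [(1, Left(1, "a")), (2, Left(2, "b"))]
right = [(1, Right(1, 9.5)), (3, Right(3, 1.0))]

grouped = cogroup(left, right)
bad_keys = mismatched_fields(grouped, Left._fields, Right._fields)
```

In this sketch bad_keys stays empty because each value keeps its own field
names. A similar check inside the real job (comparing the field names on each
row against the schema of the input it came from) might help confirm how
often the mix-up occurs.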

I attached a small snippet that reproduces the problem to the Jira issue
(it is a gist [2]).

Does this happen to others as well? Is it a known issue? Am I doing
anything wrong?

Thank you all,

[1]: https://issues.apache.org/jira/browse/SPARK-6677
[2]: https://gist.github.com/armisael/e08bb4567d0a11efe2db

-- 
Dott. Stefano Parmesan
Backend Web Developer and Data Lover ~ SpazioDati s.r.l.
Via Adriano Olivetti, 13 – 4th floor
Le Albere district – 38122 Trento – Italy

