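I am not 100% sure, but flatMap most likely unwinds the tuples: it flattens
whatever iterable your function returns, so each (username, password) tuple
becomes two separate elements. Try map instead.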
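
To see the difference without a cluster, here is a minimal plain-Python
sketch that imitates what map and flatMap do with mapper2. The sample lines
are made up for illustration, and itertools.chain only stands in for the
flattening step; it is not how Spark implements it.

```python
from itertools import chain

def mapper2(line):
    # Same parsing as in the original post: "username # password"
    words = line.split('#')
    return (words[0].strip(), words[1].strip())

# Hypothetical sample lines standing in for the contents of test.sql
lines = ["alice # secret1", "bob # secret2"]

# Imitates lines.map(mapper2): one tuple per input line
mapped = [mapper2(line) for line in lines]

# Imitates lines.flatMap(mapper2): each returned tuple is flattened,
# so usernames and passwords end up as separate elements
flat_mapped = list(chain.from_iterable(mapper2(line) for line in lines))

print(mapped)
print(flat_mapped)
```

In your job the change would be words = lines.map(mapper2), after which each
i in words.take(10) is a (username, password) tuple.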

2015-08-19 13:10 GMT+02:00 Jerry OELoo <oylje...@gmail.com>:

> Hi.
> I want to parse a file and return key-value pairs with PySpark, but the
> result is strange to me.
> test.sql is a big file in which each line is a username and a password
> separated by #. I use mapper2 below to map the data. In my
> understanding, i in words.take(10) should be a tuple, but instead i is
> a single username or password. I do not understand why. Thanks for
> your help.
>
> def mapper2(line):
>
>     words = line.split('#')
>     return (words[0].strip(), words[1].strip())
>
> def main2(sc):
>
>     lines = sc.textFile("hdfs://master:9000/spark/test.sql")
>     words = lines.flatMap(mapper2)
>
>     for i in words.take(10):
>         msg = i + ":" + "\n"
>
>
> --
> Rejoice,I Desire!
>
