Try running explain on each of these. My guess would be that caching is broken in some cases.
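
For example, something along these lines in spark-shell. This is only a rough sketch: it assumes the same people.csv and the predefined `spark` session from your transcript, and I have not checked what the plans look like on your build. The thing to compare is whether the plans for the cached csv and text reads differ. If the cache is picked up, I would expect an in-memory scan node to appear in the physical plan.

```scala
// Run inside spark-shell, where `spark` is the predefined SparkSession.
// `people.csv` is the file from the original report.

val csv = spark.read.csv("people.csv")
csv.explain(true)        // extended output: parsed, analyzed, optimized and physical plans
csv.cache.explain(true)  // same query after marking it for caching

val text = spark.read.text("people.csv")
text.explain(true)
text.cache.explain(true)

// Drop the cache entries again so this check does not affect later experiments.
csv.unpersist()
text.unpersist()
```

Comparing the analyzed plans of two separate spark.read.csv("people.csv") calls against two separate spark.read.text("people.csv") calls should show whether one source produces plans that the cache treats as identical and the other does not.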

On Tue, Aug 16, 2016 at 6:05 PM, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi,
>
> Can anyone explain why spark.read.csv("people.csv").cache.show ends up
> with a WARN while spark.read.text("people.csv").cache.show does not?
> It happens in 2.0 and today's build.
>
> scala> sc.version
> res5: String = 2.1.0-SNAPSHOT
>
> scala> spark.read.csv("people.csv").cache.show
> +---------+---------+-------+----+
> |      _c0|      _c1|    _c2| _c3|
> +---------+---------+-------+----+
> |kolumna 1|kolumna 2|kolumn3|size|
> |    Jacek| Warszawa| Polska|  40|
> +---------+---------+-------+----+
>
> scala> spark.read.csv("people.csv").cache.show
> 16/08/16 18:01:52 WARN CacheManager: Asked to cache already cached data.
> +---------+---------+-------+----+
> |      _c0|      _c1|    _c2| _c3|
> +---------+---------+-------+----+
> |kolumna 1|kolumna 2|kolumn3|size|
> |    Jacek| Warszawa| Polska|  40|
> +---------+---------+-------+----+
>
> scala> spark.read.text("people.csv").cache.show
> +--------------------+
> |               value|
> +--------------------+
> |kolumna 1,kolumna...|
> |Jacek,Warszawa,Po...|
> +--------------------+
>
> scala> spark.read.text("people.csv").cache.show
> +--------------------+
> |               value|
> +--------------------+
> |kolumna 1,kolumna...|
> |Jacek,Warszawa,Po...|
> +--------------------+
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski