I assume by "The same code perfectly works through Zeppelin 0.5.5" that you're using the %sql interpreter with your regular SQL SELECT statement, correct?
If so, the Zeppelin interpreter converts the <sql-statement> that follows %sql into sqlContext.sql(<sql-statement>) per the following code:

https://github.com/apache/incubator-zeppelin/blob/01f4884a3a971ece49d668a9783d6b705cf6dbb5/spark/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L125
https://github.com/apache/incubator-zeppelin/blob/01f4884a3a971ece49d668a9783d6b705cf6dbb5/spark/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L140-L141

Also, keep in mind that you can do something like this if you want to stay in DataFrame land:

df.selectExpr("*").limit(5).show()

On Fri, Dec 25, 2015 at 12:53 PM, Eugene Morozov <evgeny.a.moro...@gmail.com> wrote:

> Ted, Igor,
>
> Oh my... thanks a lot to both of you!
> Igor was absolutely right, but I missed that I have to use sqlContext =(
>
> Everything's perfect.
> Thank you.
>
> --
> Be well!
> Jean Morozov
>
> On Fri, Dec 25, 2015 at 8:31 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> DataFrame uses a different syntax from SQL queries.
>> I searched the unit tests but didn't find any in the form of df.select("select ...").
>>
>> Looks like you should use sqlContext as other people suggested.
>>
>> On Fri, Dec 25, 2015 at 8:29 AM, Eugene Morozov <evgeny.a.moro...@gmail.com> wrote:
>>
>>> Thanks for the comments, although the issue is not in the limit() predicate.
>>> It's something with Spark being unable to resolve the expression.
>>>
>>> I can do something like this, and it works as it's supposed to:
>>> df.select(df.col("*")).where(df.col("x1").equalTo(3.0)).show(5);
>>>
>>> But I think the old-fashioned SQL style should work as well.
>>> I have df.registerTempTable("tmptable"), and then:
>>>
>>> df.select("select * from tmptable where x1 = '3.0'").show();
>>>
>>> org.apache.spark.sql.AnalysisException: cannot resolve 'select * from tmp where x1 = '1.0'' given input columns x1, x4, x5, x3, x2;
>>>     at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>>>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:56)
>>>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.sca
>>>
>>> From the first statement I conclude that my custom datasource is perfectly fine.
>>> Just wondering how to fix / work around that.
>>> --
>>> Be well!
>>> Jean Morozov
>>>
>>> On Fri, Dec 25, 2015 at 6:13 PM, Igor Berman <igor.ber...@gmail.com> wrote:
>>>
>>>> sqlContext.sql("select * from table limit 5").show() (not sure if limit 5 is supported)
>>>>
>>>> or use Dmitriy's solution. select() defines your projection when you've already specified the entire query.
>>>>
>>>> On 25 December 2015 at 15:42, Василец Дмитрий <pronix.serv...@gmail.com> wrote:
>>>>
>>>>> hello
>>>>> you can try to use df.limit(5).show()
>>>>> just a trick :)
>>>>>
>>>>> On Fri, Dec 25, 2015 at 2:34 PM, Eugene Morozov <evgeny.a.moro...@gmail.com> wrote:
>>>>>
>>>>>> Hello, I'm basically stuck, as I have no idea where to look.
>>>>>>
>>>>>> The following simple code, given that my datasource is working, gives me an exception.
>>>>>>
>>>>>> DataFrame df = sqlc.load(filename, "com.epam.parso.spark.ds.DefaultSource");
>>>>>> df.cache();
>>>>>> df.printSchema();    <-- prints the schema perfectly fine!
>>>>>>
>>>>>> df.show();           <-- works perfectly fine (shows a table with 20 lines)!
>>>>>> df.registerTempTable("table");
>>>>>> df.select("select * from table limit 5").show();    <-- gives a weird exception
>>>>>>
>>>>>> The exception is:
>>>>>>
>>>>>> AnalysisException: cannot resolve 'select * from table limit 5' given input columns VER, CREATED, SOC, SOCC, HLTC, HLGTC, STATUS
>>>>>>
>>>>>> I can do a collect on the dataframe, but cannot select any specific columns with either "select * from table" or "select VER, CREATED from table".
>>>>>>
>>>>>> I use Spark 1.5.2.
>>>>>> The same code perfectly works through Zeppelin 0.5.5.
>>>>>>
>>>>>> Thanks.
>>>>>> --
>>>>>> Be well!
>>>>>> Jean Morozov
>>>>>
>>>>
>>>
>>
>

--
*Chris Fregly*
Principal Data Solutions Engineer
IBM Spark Technology Center, San Francisco, CA
http://spark.tc | http://advancedspark.com
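[Editor's note] Pulling the thread's resolution together: df.select() takes column names/expressions, so passing it a full SQL statement makes Spark look for a single column literally named "select * from table limit 5", hence the AnalysisException. Full SQL must go through sqlContext.sql(), which is exactly what Zeppelin's %sql interpreter does behind the scenes. A minimal sketch, assuming the Spark 1.5.x Java API and the custom datasource from the thread (the filename "data.sas7bdat" is a hypothetical placeholder):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class SqlVsSelect {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("sql-vs-select").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Custom datasource from the thread; any datasource behaves the same way.
        DataFrame df = sqlContext.load("data.sas7bdat", "com.epam.parso.spark.ds.DefaultSource");
        df.registerTempTable("table");

        // Wrong: select() expects column names/expressions, so the whole SQL
        // string is treated as one unresolvable column -> AnalysisException.
        // df.select("select * from table limit 5").show();

        // Right: full SQL statements go through the SQLContext
        // (this is what Zeppelin's %sql interpreter does under the hood).
        sqlContext.sql("SELECT * FROM table LIMIT 5").show();

        // Or stay in DataFrame land entirely, as Chris suggests:
        df.selectExpr("*").limit(5).show();

        sc.stop();
    }
}
```

Note that selectExpr() is the middle ground: it parses SQL *expressions* (projections like "x1", "x1 * 2 AS doubled"), but still not full statements.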