Thanks for the comments, but the issue is not in the limit() predicate.
It has something to do with Spark being unable to resolve the expression.

I can do something like this, and it works as it is supposed to:
 df.select(df.col("*")).where(df.col("x1").equalTo(3.0)).show(5);

But I think old-fashioned SQL style has to work as well. I call
df.registerTempTable("tmptable") and then

df.select("select * from tmptable where x1 = '3.0'").show();

org.apache.spark.sql.AnalysisException: cannot resolve 'select * from tmp
where x1 = '1.0'' given input columns x1, x4, x5, x3, x2;

at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:56)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.sca


From the first statement I conclude that my custom datasource is perfectly
fine.
I just wonder how to fix or work around that.
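
For reference, here is a minimal sketch of the route Igor suggests in the quoted reply below, assuming the Spark 1.5.x Java API used in this thread. As far as I can tell, df.select(String) expects column names, so a whole query string passed to it is looked up as a single column, which would explain the "cannot resolve" error; sqlContext.sql(String) is the call that parses SQL text. The class name, the helper countMatchingRows(), the local master and the two-column stand-in data are hypothetical and only there to make the sketch self-contained; x1 is assumed to be a double column, so the literal is written without quotes.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class SqlVersusSelect {

    // Hypothetical helper: builds a tiny stand-in DataFrame (not the custom
    // datasource from this thread), registers it as a temp table and returns
    // the number of rows matched by a SQL query.
    static long countMatchingRows() {
        SparkConf conf = new SparkConf()
            .setAppName("sql-vs-select")
            .setMaster("local[1]"); // assumption: local run, for illustration only
        JavaSparkContext jsc = new JavaSparkContext(conf);
        try {
            SQLContext sqlContext = new SQLContext(jsc);

            // Stand-in schema: two double columns named like the ones in the error message.
            List<StructField> fields = Arrays.asList(
                DataTypes.createStructField("x1", DataTypes.DoubleType, false),
                DataTypes.createStructField("x2", DataTypes.DoubleType, false));
            StructType schema = DataTypes.createStructType(fields);

            List<Row> rows = Arrays.asList(
                RowFactory.create(3.0, 1.0),
                RowFactory.create(1.0, 2.0));
            DataFrame df = sqlContext.createDataFrame(jsc.parallelize(rows), schema);

            // The temp table name is what the SQL text refers to.
            df.registerTempTable("tmptable");

            // SQL text goes to sqlContext.sql(...), not to df.select(...).
            DataFrame result = sqlContext.sql("select * from tmptable where x1 = 3.0");
            return result.count();
        } finally {
            jsc.stop(); // release the local Spark context
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatchingRows());
    }
}
```

With the real datasource, only the last two statements before the count should matter: register the DataFrame under a name, then pass the full query to sqlContext.sql(...) and call show() on what it returns.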
--
Be well!
Jean Morozov

On Fri, Dec 25, 2015 at 6:13 PM, Igor Berman <igor.ber...@gmail.com> wrote:

> sqlContext.sql("select * from table limit 5").show() (not sure if limit 5
> is supported)
>
> or use Dmitriy's solution. select() only defines your projection, whereas
> you've passed it an entire query
>
> On 25 December 2015 at 15:42, Василец Дмитрий <pronix.serv...@gmail.com>
> wrote:
>
>> Hello,
>> you can try to use df.limit(5).show()
>> Just a trick :)
>>
>> On Fri, Dec 25, 2015 at 2:34 PM, Eugene Morozov <
>> evgeny.a.moro...@gmail.com> wrote:
>>
>>> Hello, I'm basically stuck, as I have no idea where to look.
>>>
>>> The following simple code gives me an exception, even though my
>>> datasource is working.
>>>
>>> DataFrame df = sqlc.load(filename, "com.epam.parso.spark.ds.DefaultSource");
>>> df.cache();
>>> df.printSchema();       <-- prints the schema perfectly fine!
>>>
>>> df.show();              <-- works perfectly fine (shows table with 20 lines)!
>>> df.registerTempTable("table");
>>> df.select("select * from table limit 5").show(); <-- gives weird exception
>>>
>>> Exception is:
>>>
>>> AnalysisException: cannot resolve 'select * from table limit 5' given input 
>>> columns VER, CREATED, SOC, SOCC, HLTC, HLGTC, STATUS
>>>
>>> I can do a collect on the DataFrame, but cannot select any specific
>>> columns with either "select * from table" or "select VER, CREATED from table".
>>>
>>> I use spark 1.5.2.
>>> The same code works perfectly through Zeppelin 0.5.5.
>>>
>>> Thanks.
>>> --
>>> Be well!
>>> Jean Morozov
>>>
>>
>>
>
