hql and sql are just two different dialects for querying data. Once
parsing is complete and the logical plan has been constructed, execution
is exactly the same.
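
To make the SOME vs. ALL ambiguity from the quoted message below concrete, here is a rough plain-Python sketch of what the lateral view query computes. It is only a model of the semantics, not how Spark SQL executes anything; the sample records and the helper name lateral_view_explode are made up for illustration.

```python
# Plain-Python model of the query semantics; no Spark required.
# The records below are made-up illustration data, not from the thread.
people = [
    {"name": "alice", "schools": [{"name": "a", "time": 1}, {"name": "b", "time": 3}]},
    {"name": "bob", "schools": [{"name": "c", "time": 4}, {"name": "d", "time": 5}]},
    {"name": "carol", "schools": [{"name": "e", "time": 1}]},
]


def lateral_view_explode(rows, array_field, alias):
    """Yield one output row per array element (hypothetical helper)."""
    for row in rows:
        for element in row[array_field]:
            # Copy the parent row and attach the single element under `alias`.
            yield {**row, alias: element}


# SELECT DISTINCT name FROM people
# LATERAL VIEW explode(schools) s AS school WHERE school.time > 2
exploded = lateral_view_explode(people, "schools", "school")
some_match = sorted({r["name"] for r in exploded if r["school"]["time"] > 2})

# The other possible reading: every school must satisfy time > 2.
all_match = sorted(
    p["name"] for p in people if all(s["time"] > 2 for s in p["schools"])
)

print(some_match)  # ['alice', 'bob']
print(all_match)   # ['bob']
```

So the lateral view plus DISTINCT gives the SOME reading. One difference to keep in mind: a person with an empty schools array produces no exploded rows at all, while the all(...) check in this sketch would count them as a match.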


On Tue, Jul 15, 2014 at 2:50 PM, Jerry Lam <chiling...@gmail.com> wrote:

> Hi Michael,
>
> I don't understand the difference between hql (HiveContext) and sql
> (SQLContext). My previous understanding was that hql is hive specific.
> Unless the table is managed by Hive, we should use sql. For instance, RDD
> (hdfsRDD) created from files in HDFS and registered as a table should use
> sql.
>
> However, my current understanding after trying your suggestion above is
> that I can also query the hdfsRDD using hql via LocalHiveContext. I just
> tested it, the lateral view explode(schools) works with the hdfsRDD.
>
> It seems to me that HiveContext and SQLContext are the same, except that
> HiveContext needs a metastore and has more powerful SQL support
> borrowed from Hive. Can you shed some light on this when you get a minute?
>
> Thanks,
>
> Jerry
>
>
>
>
>
> On Tue, Jul 15, 2014 at 4:32 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> No, that is why I included the link to SPARK-2096
>> <https://issues.apache.org/jira/browse/SPARK-2096> as well.  You'll need
>> to use HiveQL at this time.
>>
>>>> Is it possible or planned to support the "schools.time" format to filter
>>>> the records where there is an element inside the array of schools that
>>>> satisfies time > 2?
>>>>
>>>
>> It would be great to support something like this, but it's going to take a
>> while to hammer out the correct semantics, as SQL does not in general have
>> great support for nested structures.  I think different people might
>> interpret that query to mean there is SOME school.time > 2 vs. ALL
>> school.time > 2, etc.
>>
>> You can get what you want now using a lateral view:
>>
>> hql("SELECT DISTINCT name FROM people LATERAL VIEW explode(schools) s as
>> school WHERE school.time > 2")
>>
>
>
