I see, thanks for the info!
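For anyone who finds this thread later, here is a minimal sketch of the difference Xiao describes below (assuming Spark 2.1 with the in-memory catalog; the table names are just for illustration, and the snippet is a sketch rather than something tested against a particular build):

    import org.apache.spark.sql.SparkSession

    // Session backed by the in-memory catalog instead of the Hive metastore.
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.sql.catalogImplementation", "in-memory")
      .getOrCreate()

    // A regular data source table: its metadata lives only in the
    // InMemoryCatalog, and both creating it and inserting into it work.
    spark.sql("CREATE TABLE t1 (id INT, name STRING, dept STRING) USING parquet")
    spark.sql("INSERT INTO t1 VALUES (1, 'name1', 'dept1')")
    spark.sql("SELECT * FROM t1").show()

    // A Hive serde table (plain CREATE TABLE without a USING clause); this is
    // the case https://github.com/apache/spark/pull/16587 rejects up front
    // when Hive support is not available.
    // spark.sql("CREATE TABLE t2 (id INT, name STRING, dept STRING)")

The metadata for t1 disappears when the session stops, which is the "persistently stored or not" difference Xiao mentions below.
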
On Mon, Jan 23, 2017 at 4:12 PM, Xiao Li <gatorsm...@gmail.com> wrote:

> Reynold mentioned the direction we are heading in. You can see that many
> PRs the community has submitted are for this target. To achieve this,
> there is a lot of work we need to do.
>
> For example, for some serdes, the Hive metastore will infer the schema
> when the schema is not provided, but our InMemoryCatalog does not have
> such a capability. Thus, we need to see how to resolve this.
>
> Hopefully that answers your question. BTW, the issue you mentioned at the
> beginning has been resolved. Please fetch the latest master. You are now
> unable to create such a Hive serde table without Hive support.
>
> Thanks,
>
> Xiao Li
>
> 2017-01-23 0:01 GMT-08:00 Shuai Lin <linshuai2...@gmail.com>:
>
>> Cool, thanks for the info.
>>
>>> I think this is something we are going to change to completely decouple
>>> the Hive support and catalog.
>>
>> Is there a ticket for this? I did a search in JIRA and only found
>> "SPARK-16275: Implement all the Hive fallback functions", which seems to
>> be related to it.
>>
>> On Mon, Jan 23, 2017 at 3:21 AM, Xiao Li <gatorsm...@gmail.com> wrote:
>>
>>> Agree. : )
>>>
>>> 2017-01-22 11:20 GMT-08:00 Reynold Xin <r...@databricks.com>:
>>>
>>>> To be clear, there are two separate "Hive"s we are talking about here.
>>>> One is the catalog, and the other is the Hive serde and UDF support.
>>>> We want to get to a point where the choice of catalog does not impact
>>>> the functionality in Spark other than where the catalog is stored.
>>>>
>>>> On Sun, Jan 22, 2017 at 11:18 AM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>
>>>>> We have a pending PR to block users from creating a Hive serde table
>>>>> when using InMemoryCatalog. See:
>>>>> https://github.com/apache/spark/pull/16587
>>>>> I believe it answers your question.
>>>>>
>>>>> BTW, we can still create regular data source tables and insert data
>>>>> into them. The major difference is whether the metadata is
>>>>> persistently stored or not.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Xiao Li
>>>>>
>>>>> 2017-01-22 11:14 GMT-08:00 Reynold Xin <r...@databricks.com>:
>>>>>
>>>>> I think this is something we are going to change to completely
>>>>> decouple the Hive support and catalog.
>>>>>
>>>>> On Sun, Jan 22, 2017 at 4:51 AM Shuai Lin <linshuai2...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Currently, when the in-memory catalog is used, e.g. through `--conf
>>>>> spark.sql.catalogImplementation=in-memory`, we can create a
>>>>> persistent table, but inserting into this table fails with the error
>>>>> message "Hive support is required to insert into the following
>>>>> tables..".
>>>>>
>>>>> sql("create table t1 (id int, name string, dept string)") // OK
>>>>> sql("insert into t1 values (1, 'name1', 'dept1')") // ERROR
>>>>>
>>>>> This doesn't make sense to me, because this table would always be
>>>>> empty if we can't insert into it, and thus would be of no use. But I
>>>>> wonder if there are other good reasons for the current logic. If not,
>>>>> I would propose raising an error when creating the table in the first
>>>>> place.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Regards,
>>>>> Shuai Lin (@lins05)