Re: Two newbie questions about Iceberg

2019-08-12 Thread Ryan Blue
Great, thanks for working on this, Saisai!

On Thu, Aug 8, 2019 at 7:38 PM Saisai Shao  wrote:

> I'm still looking into this to figure out a way to add the HIVE_LOCKS
> table on the Spark side. In any case, I will create an issue first to
> track this.
>
> Best regards,
> Saisai
>
> Ryan Blue wrote on Fri, Aug 9, 2019 at 4:58 AM:
>
>> Any ideas on how to fix this? Can we automatically create the HIVE_LOCKS
>> table if it is missing?
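
>> The check-and-create idea could be sketched roughly as below. This is a
>> hedged illustration only: SQLite stands in for the embedded Derby
>> database, and the column list is a simplified invention, not the real
>> Hive transaction schema.

```python
import sqlite3

def ensure_lock_table(conn: sqlite3.Connection) -> None:
    # Create the lock table only if it does not already exist, so the call
    # is safe to repeat on every startup. The columns here are a simplified
    # stand-in for illustration, not Hive's actual HIVE_LOCKS definition.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS HIVE_LOCKS (
            HL_LOCK_EXT_ID INTEGER NOT NULL,
            HL_LOCK_INT_ID INTEGER NOT NULL,
            HL_TABLE       TEXT,
            PRIMARY KEY (HL_LOCK_EXT_ID, HL_LOCK_INT_ID)
        )
        """
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
ensure_lock_table(conn)
ensure_lock_table(conn)  # idempotent: a second call is a no-op
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)  # → ['HIVE_LOCKS']
```

>> Whether Iceberg should do this automatically, rather than requiring the
>> schema script, is exactly the open question in this thread.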
>>
>> On Wed, Aug 7, 2019 at 7:13 PM Saisai Shao 
>> wrote:
>>
>>> Thanks guys for your reply.
>>>
>>> I didn't do anything special; I don't even have Hive configured. I
>>> simply put the Iceberg (assembly) jar into Spark and started a local
>>> Spark process. I believe the Hive version built into Spark is
>>> 1.2.1-spark (with a slight pom change), and all of the SparkSQL/Hive
>>> configurations are at their defaults. I guess the cause is what Anton
>>> mentioned; I will try creating all of the tables (including HIVE_LOCKS)
>>> with a script. But I think we should fix this, since it potentially
>>> prevents users from doing a quick start with local Spark.
>>>
 I think the reason why it works in tests is because we create all tables
 (including HIVE_LOCKS) using a script.

>>>
>>> Best regards,
>>> Saisai
>>>
>>> Anton Okolnychyi wrote on Wed, Aug 7, 2019 at 11:56 PM:
>>>
 I think the reason why it works in tests is because we create all
 tables (including HIVE_LOCKS) using a script. I am not sure lock tables are
 always created in embedded mode.

 > On 7 Aug 2019, at 16:49, Ryan Blue  wrote:
 >
 > This is the right list. Iceberg is fairly low in the stack, so most
 questions are probably dev questions.
 >
 > I'm surprised that this doesn't work with an embedded metastore
 because we use an embedded metastore in tests:
 https://github.com/apache/incubator-iceberg/blob/master/hive/src/test/java/org/apache/iceberg/hive/TestHiveMetastore.java
 >
 > But we are also using Hive 1.2.1 and a metastore schema for 3.1.0. I
 wonder if a newer version of Hive would avoid this problem? What version
 are you linking with?
 >
 > On Tue, Aug 6, 2019 at 8:59 PM Saisai Shao 
 wrote:
 > Hi team,
 >
 > I just hit some issues when trying Iceberg with the quick start guide.
 I'm not sure whether it is appropriate to send this to the dev mailing
 list (there seems to be no user mailing list).
 >
 > One issue is that the current Iceberg does not seem to run with an
 embedded metastore; it throws an exception. Is this intentional behavior
 (forcing use of a remote HMS), or just a bug?
 >
 > Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Unable
 to update transaction database java.sql.SQLSyntaxErrorException: Table/View
 'HIVE_LOCKS' does not exist.
 > at
 org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
 Source)
 > at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown
 Source)
 > at
 org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown
 Source)
 >
 > Following on from this issue: it seems that the current Iceberg binds
 only to HMS as its catalog, which is fine for production use. But I'm
 wondering whether we could have a simple catalog, like Spark's in-memory
 catalog, so that it is easy for users to test and experiment. Is there any
 concern or plan here?
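 >
 > In spirit, such a catalog could be as small as a map from table
 identifiers to metadata locations. The sketch below is a toy for
 illustration only; the class and method names are invented and this is
 not Iceberg's Catalog API.

```python
# Toy in-memory catalog: a dict from table identifier to the table's
# current metadata location. Invented for illustration; not Iceberg's API.
class InMemoryCatalog:
    def __init__(self):
        self._tables = {}

    def create_table(self, identifier: str, metadata_location: str) -> None:
        # Refuse to clobber an existing table, mirroring catalog semantics.
        if identifier in self._tables:
            raise ValueError(f"Table already exists: {identifier}")
        self._tables[identifier] = metadata_location

    def load_table(self, identifier: str) -> str:
        # Raises KeyError for unknown tables.
        return self._tables[identifier]

    def drop_table(self, identifier: str) -> None:
        self._tables.pop(identifier, None)

catalog = InMemoryCatalog()
catalog.create_table("db.events", "/tmp/db/events/metadata/v1.json")
print(catalog.load_table("db.events"))
```

 > A real implementation would also need atomic metadata swaps on commit,
 which is the part HMS locking currently provides.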
 >
 > Best regards,
 > Saisai
 >
 >
 >
 >
 > --
 > Ryan Blue
 > Software Engineer
 > Netflix


>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
Ryan Blue
Software Engineer
Netflix

