Re: Two newbie questions about Iceberg
Great, thanks for working on this, Saisai!

-- 
Ryan Blue
Software Engineer
Netflix
Re: Two newbie questions about Iceberg
I'm still looking into this to figure out a way to add the HIVE_LOCKS table on the Spark side. In any case, I will create an issue first to track this.

Best regards,
Saisai
Re: Two newbie questions about Iceberg
Any ideas on how to fix this? Can we create the HIVE_LOCKS table automatically if it is missing?

-- 
Ryan Blue
Software Engineer
Netflix
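The "create it automatically if it is missing" idea above could, in principle, look something like the following sketch. This is illustrative only: it uses SQLite in place of Derby, and the column list is a simplified stand-in, not the real HIVE_LOCKS schema (which lives in Hive's metastore schema scripts):

```python
# Sketch of create-on-first-use for a missing lock table.
# SQLite stands in for Derby; the columns are a simplified placeholder.
import sqlite3


def ensure_lock_table(conn: sqlite3.Connection) -> None:
    """Create the lock table if the schema script was never run."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS HIVE_LOCKS (
            HL_LOCK_EXT_ID INTEGER NOT NULL,
            HL_DB TEXT NOT NULL,
            HL_TABLE TEXT,
            HL_LOCK_STATE TEXT NOT NULL
        )
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
ensure_lock_table(conn)
# A second call is a no-op thanks to IF NOT EXISTS.
ensure_lock_table(conn)
conn.execute("INSERT INTO HIVE_LOCKS VALUES (?, ?, ?, ?)", (1, "db", "tbl", "w"))
print(conn.execute("SELECT COUNT(*) FROM HIVE_LOCKS").fetchone()[0])  # prints 1
```

The idempotent `IF NOT EXISTS` check is what makes this safe to run on every lock acquisition rather than only at install time.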
Re: Two newbie questions about Iceberg
Thanks, guys, for your reply.

I didn't do anything special; I don't even have a configured Hive. I simply put the Iceberg (assembly) jar into Spark and started a local Spark process. I think the Hive version built into Spark is 1.2.1-spark (with a slight pom change), and all of the SparkSQL/Hive configurations are defaults. I guess the reason is as Anton mentioned; I will try creating all the tables (including HIVE_LOCKS) using the script. But I think we should fix this, since it potentially stops users from doing a quick start with local Spark.

> I think the reason why it works in tests is because we create all tables
> (including HIVE_LOCKS) using a script.

Best regards,
Saisai
Re: Two newbie questions about Iceberg
I think the reason why it works in tests is because we create all tables (including HIVE_LOCKS) using a script. I am not sure lock tables are always created in embedded mode.
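The pattern Anton describes, running a full DDL script up front rather than creating tables on demand, can be sketched roughly as follows. SQLite again stands in for Derby, and the two-table DDL is a toy placeholder, not Hive's actual schema scripts:

```python
# Sketch of "run the whole schema script up front", as the tests do.
# SQLite stands in for Derby; the DDL below is a toy placeholder.
import sqlite3

SCHEMA_SCRIPT = """
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, TBL_NAME TEXT NOT NULL);
CREATE TABLE HIVE_LOCKS (HL_LOCK_EXT_ID INTEGER NOT NULL, HL_DB TEXT NOT NULL);
"""


def init_schema(conn: sqlite3.Connection) -> None:
    """Run the DDL script once, before the service takes any traffic."""
    conn.executescript(SCHEMA_SCRIPT)


conn = sqlite3.connect(":memory:")
init_schema(conn)
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
print(sorted(tables))  # prints ['HIVE_LOCKS', 'TBLS']
```

Because every table (including the lock table) is created before first use, the service never hits a "table does not exist" error at runtime, which is exactly why the tests don't reproduce the failure.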
Re: Two newbie questions about Iceberg
This is the right list. Iceberg is fairly low in the stack, so most questions are probably dev questions.

I'm surprised that this doesn't work with an embedded metastore, because we use an embedded metastore in tests: https://github.com/apache/incubator-iceberg/blob/master/hive/src/test/java/org/apache/iceberg/hive/TestHiveMetastore.java

But we are also using Hive 1.2.1 and a metastore schema for 3.1.0. I wonder if a newer version of Hive would avoid this problem? What version are you linking with?

On Tue, Aug 6, 2019 at 8:59 PM Saisai Shao wrote:
> Hi team,
>
> I just ran into some issues while trying Iceberg with the quick start guide. I'm not sure whether it is proper to send this to the dev mail list (there seems to be no user mail list).
>
> One issue is that current Iceberg seemingly cannot run with an embedded metastore; it throws an exception. Is this behavior intentional (forcing the use of a remote HMS), or is it a bug?
>
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Unable to update transaction database java.sql.SQLSyntaxErrorException: Table/View 'HIVE_LOCKS' does not exist.
>     at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>     at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
>     at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
>
> Related to this issue: it seems that Iceberg currently binds only to HMS as its catalog. That is fine for production use, but I'm wondering if we could have a simple catalog, like Spark's in-memory catalog, so that it is easy for users to test and play with. Are there any concerns or plans here?
>
> Best regards,
> Saisai

-- 
Ryan Blue
Software Engineer
Netflix
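Saisai's closing suggestion, a simple in-memory catalog for testing alongside the HMS-backed one, can be sketched roughly as below. The class and method names here are made up for illustration and are not Iceberg's actual Catalog API:

```python
# Hypothetical in-memory catalog for local testing; names are illustrative.
from typing import Dict


class NoSuchTableError(Exception):
    """Raised when a table identifier is not in the catalog."""


class InMemoryCatalog:
    """Holds table metadata in a plain dict instead of the Hive metastore."""

    def __init__(self) -> None:
        self._tables: Dict[str, dict] = {}

    def create_table(self, identifier: str, schema: dict) -> None:
        if identifier in self._tables:
            raise ValueError(f"Table already exists: {identifier}")
        self._tables[identifier] = {"schema": schema, "snapshots": []}

    def load_table(self, identifier: str) -> dict:
        try:
            return self._tables[identifier]
        except KeyError:
            raise NoSuchTableError(identifier) from None

    def drop_table(self, identifier: str) -> bool:
        return self._tables.pop(identifier, None) is not None


catalog = InMemoryCatalog()
catalog.create_table("db.events", {"id": "long", "ts": "timestamp"})
print(catalog.load_table("db.events")["schema"])  # the schema we registered
```

Since nothing is persisted, state disappears with the process, which is fine for quick-start experiments and unit tests but, as noted above, not a replacement for HMS in production.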
