Re: pio app new failed in hbase
Thank you very much for your explanation. It all makes sense, of course; I guess as soon as I set up a cluster everything will be better (and more manageable/predictable). Messing with Hadoop, HDFS, and HBase all on the same machine doesn't seem the way to go, even at the development stage. I'm also using the Universal Recommender precisely because it does not require Spark for anything but training; that's a major benefit and indeed perfect for a cluster setup.

2018-05-29 20:01 GMT+02:00 Pat Ferrel:

> No, this is as expected. When you run pseudo-distributed, everything internal is configured as if the services were on separate machines. See the clustered instructions here: http://actionml.com/docs/small_ha_cluster These set up 3 machines running different parts; it is not really the best physical architecture, but it does illustrate how a distributed setup would go.
>
> BTW, we (ActionML) use containers now to do this setup, but it still works. The smallest distributed cluster that makes sense for the Universal Recommender is 5 machines: 2 dedicated to Spark, which can be started and stopped around the `pio train` process. So 3 are permanent: one for the PIO servers (EventServer and PredictionServer), one for HDFS+HBase, and one for Elasticsearch. This allows you to vertically scale by increasing the size of the service instances in place (easy with AWS), then horizontally scale HBase, Elasticsearch, or Spark independently if vertical scaling is not sufficient. You can also combine the 2 Spark instances as long as you remember that the `pio train` process creates a Spark Driver on the machine the process is launched on, so the driver may need to be nearly as powerful as a Spark Executor. The Spark Driver is an "invisible" and therefore often overlooked member of the Spark cluster.
> It is often, but not always, smaller than the executors, so putting it on the PIO servers machine is risky in terms of scaling unless you know the resources it will need. Using YARN can put the Driver on the cluster (off the launching machine), but that is more complex than the default Spark "standalone" config.
>
> The Universal Recommender is the exception here because it does not require a big non-local Spark for anything but training, so we move the `pio train` process to a Spark "Driver" machine that is as ephemeral as the Spark Executor(s). Other templates may require Spark in both train and deploy. Once the UR's training is done, it automatically swaps in the new model, so the running deployed PredictionServer starts using it with no re-deploy needed.
>
> From: Marco Goldin
> Reply: user@predictionio.apache.org
> Date: May 29, 2018 at 6:38:21 AM
> To: user@predictionio.apache.org
> Subject: Re: pio app new failed in hbase
>
> I was able to solve the issue by deleting the hbase folder in HDFS with "hdfs dfs -rm -r /hbase" and restarting HBase. Now app creation in pio is working again.
>
> I still wonder why this problem happens, though. I'm running HBase in pseudo-distributed mode (for testing purposes everything, from Spark to Hadoop, is on a single machine); could that be a problem for PredictionIO in managing the apps?
>
> 2018-05-29 13:47 GMT+02:00 Marco Goldin:
>
>> Hi all, I deleted all old apps from PredictionIO (currently running 0.12.0), but when I create a new one I get this error from HBase. I inspected HBase from the shell but there aren't any tables inside.
>>
>> ```
>> pio app new mlolur
>>
>> [INFO] [HBLEvents] The table pio_event:events_1 doesn't exist yet. Creating now...
>>
>> Exception in thread "main" org.apache.hadoop.hbase.TableExistsException: org.apache.hadoop.hbase.TableExistsException: pio_event:events_1
>>   at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.prepareCreate(CreateTableProcedure.java:299)
>>   at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:106)
>>   at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:58)
>>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:119)
>>   at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:498)
>>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1147)
>>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:942)
>>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:895)
>>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:77)
>>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:497)
>>
>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>   at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>>   at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:209)
>>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:223)
>>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:121)
>>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
>>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3347)
>>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:603)
>>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:494)
>>   at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:428)
>>   at org.apache.predictionio.data.storage.hbase.HBLEvents.init(HBLEvents.scala:66)
>>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4$$anonfun$apply$5.apply(App.scala:63)
>>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4$$anonfun$apply$5.apply(App.scala:62)
>>   at scala.Option.map(Option.scala:146)
>>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4.apply(App.scala:62)
>>   at org.apache.predictionio.tools.commands.App$$anonfun$create$4.apply(App.scala:55)
>>   at scala.Option.getOrElse(Option.scala:121)
>>   at org.apache.predictionio.tools.commands.App$.create(App.scala:55)
>>   at org.apache.predictionio.tools.console.Pio$App$.create(Pio.scala:173)
>>   at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:726)
>>   at org.apache.predictionio.tools.console.Console$$anonfun$main$1.apply(Console.scala:656)
>>   at scala.Option.map(Option.scala:146)
>>   at org.apache.predictionio.tools.console.Console$.main(Console.scala:656)
>>   at org.apache.predictionio.tools.console.Console.main(Console.scala)
>> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.TableExistsException): org.apache.hadoop.hbase.TableExistsException: pio_event:events_1
>>   at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.prepareCreate(CreateTableProcedure.java:299)
>>   at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:106)
>>   at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:58)
>>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(
>> ```
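[Editor's note] For anyone hitting the same TableExistsException on a single-machine setup, the workaround Marco describes amounts to the shell sequence below. It is a sketch only: the start/stop script paths assume a typical HBase tarball install with $HBASE_HOME set, and note that removing /hbase destroys ALL HBase data, so it is only sensible on a disposable dev/test box.

```shell
# WARNING: this wipes every HBase table -- dev/test machines only.
"$HBASE_HOME/bin/stop-hbase.sh"   # stop HBase so nothing holds the HDFS files open
hdfs dfs -rm -r /hbase            # remove HBase's root directory in HDFS
"$HBASE_HOME/bin/start-hbase.sh"  # HBase recreates a fresh /hbase on startup
pio app new mlolur                # table creation should now succeed
```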
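[Editor's note] For readers setting up the clustered layout Pat describes, each service is pointed at its own machine through PredictionIO's conf/pio-env.sh. The sketch below uses real pio-env.sh variable names, but the hostnames and install paths are hypothetical, and the Hadoop/HBase client config dirs must themselves name the remote masters (this is a config fragment, not a complete file):

```shell
# conf/pio-env.sh -- sketch of a split-service setup (hostnames/paths hypothetical)
SPARK_HOME=/usr/local/spark             # Spark client used to submit `pio train`
HADOOP_CONF_DIR=/usr/local/hadoop/conf  # core-site.xml points at the remote HDFS NameNode
HBASE_CONF_DIR=/usr/local/hbase/conf    # hbase-site.xml names the HBase/ZooKeeper host

# Metadata in Elasticsearch, event data in HBase, on their dedicated machines:
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es-node-1   # the Elasticsearch machine
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200

PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/usr/local/hbase
```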
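[Editor's note] On the Driver-sizing point in Pat's reply: `pio train` passes arguments appearing after `--` through to spark-submit, so the Driver (which runs on the launching machine under standalone Spark) and the Executors can be sized explicitly. A sketch; the master hostname and memory/core figures are placeholders, not recommendations:

```shell
# Train against a remote standalone Spark master, sizing the Driver
# close to an Executor as discussed above:
pio train -- --master spark://spark-master:7077 \
             --driver-memory 8g \
             --executor-memory 16g \
             --total-executor-cores 8
```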