garawalid edited a comment on pull request #1100:
URL: https://github.com/apache/iceberg/pull/1100#issuecomment-640137500


   @Jiayi-Liao thanks for the PR.
   
   I think a single Dockerfile with standalone Spark, Hive, and HDFS would be 
better. We don't actually need a name node and a data node. After all, the 
Docker image will be used for a quick demo. 
   And for HDFS, we could mount a host directory into the container, so the 
data is available from the container.
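   
   A minimal sketch of the mount idea, assuming a hypothetical image name 
(`iceberg-demo`) and hypothetical paths (`./warehouse` on the host, 
`/data/warehouse` in the container); none of these come from the PR. It only 
assembles a `docker run -v` command line and does not start anything:

```python
import os
import shlex


def docker_run_command(image, host_dir, container_dir):
    """Build a `docker run` argv that bind-mounts host_dir at container_dir."""
    # Use an absolute host path so -v is treated as a bind mount
    # rather than as the name of a Docker-managed volume.
    host_path = os.path.abspath(host_dir)
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_path}:{container_dir}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name and paths, for illustration only.
    cmd = docker_run_command("iceberg-demo", "./warehouse", "/data/warehouse")
    print(shlex.join(cmd))
```

   With a mount like this, table data written under the container path would 
remain in the host directory after the container exits.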
   
   
   > **(1) Should we use Hive Table or Hadoop Table in the docker demo?**
   > 
   > Currently I'm using the Hadoop Table and Spark to demonstrate the writing 
process. It'd be fine if you think a Hive Table would be better because of its 
popularity. I guess it'd be easy to add one more hive container here.
   
   +1 to adding Hive, so we can test all Iceberg features with it.
   
   > **(2) How often do we need to update the docker?**
   > 
   > I think there may exist two choices:
   > 
   > * Update the Docker image on every Iceberg build. This way, users can 
always try the latest Iceberg built from the master branch, which lets users 
and developers try new features as soon as possible.
   > * Update the Docker image on every stable Iceberg release. This is what 
I'm doing now: downloading the stable release from Apache's repository. The 
Docker image will be updated only when Iceberg releases a new version.
   
   Docker Hub can trigger a build after each commit to the Iceberg 
repository, but we would need to link the two. 
   I think the best way is to download the Dockerfile and build the image 
locally with the target Hadoop, Hive, and Spark versions.
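   
   As a rough sketch of what a local build could look like, assuming the 
Dockerfile declares `ARG` entries for the versions: the argument names 
(`HADOOP_VERSION`, `HIVE_VERSION`, `SPARK_VERSION`), the tag, and the version 
numbers below are hypothetical and not taken from the PR. The helper only 
assembles a `docker build` command line with `--build-arg` flags:

```python
import shlex


def docker_build_command(tag, versions, context="."):
    """Build a `docker build` argv that passes versions as --build-arg values."""
    cmd = ["docker", "build", "-t", tag]
    for name, value in sorted(versions.items()):
        # Each name must match an `ARG <name>` declared in the Dockerfile.
        cmd += ["--build-arg", f"{name}={value}"]
    # The build context (directory containing the Dockerfile) comes last.
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    # Hypothetical ARG names and example versions, for illustration only.
    versions = {
        "HADOOP_VERSION": "2.7.7",
        "HIVE_VERSION": "2.3.7",
        "SPARK_VERSION": "2.4.5",
    }
    print(shlex.join(docker_build_command("iceberg-demo:local", versions)))
```

   Users could then pick the versions that match their own environment 
without waiting for a published image.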
   
   > **(3) Features included in the docker demo**
   > 
   > The current user guide in 
[README.md](https://github.com/apache/iceberg/blob/0a92afae5263ccbd608de0efdc646ab479fb9638/docker/README.md)
 is a temporary version for committers to review the code, and I'm going to 
find a more popular scenario and more data for newcomers to Iceberg (I'd 
appreciate any help with this).
   
   I think a getting-started tutorial with iceberg-docker is enough! It's 
up to the user to try Iceberg and build a quick PoC.
    
   > **(4) Will the docker demo be demonstrated on the website?**  
   
   A getting-started guide with Docker would be nice!
   
   

