MonkeyCanCode opened a new pull request, #176:
URL: https://github.com/apache/polaris/pull/176

   # Description
   
   `docker-compose-jupyter.yml` is the demo showing how to interact with 
Iceberg through Spark when using Polaris as the catalog and a Jupyter notebook 
as the UI. Currently, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are set 
through the environment, but only on the `polaris` container and not on the 
`jupyter` container. This causes a failure, because the write happens in the 
`jupyter` container via Spark. Also, the `jupyter` container sets `AWS_REGION` 
to `us-west-2` both in the notebook code and through a hard-coded environment 
variable; we should set it in one place instead.
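
As a rough illustration of keeping these settings in one place, the notebook could read them from the container environment instead of hard-coding them. This is only a sketch: the helper name `aws_settings_from_env` and the `REQUIRED_AWS_VARS` tuple are hypothetical and not part of this PR.

```python
import os

# Variables the notebook is assumed to need; the names come from the PR text.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def aws_settings_from_env(env=None):
    """Return the required AWS settings from a mapping (os.environ by default).

    Hypothetical helper: raises RuntimeError naming every variable that is
    missing or empty, so a misconfigured container fails early.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_AWS_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_AWS_VARS}
```

Calling `aws_settings_from_env()` in the first notebook cell would surface a missing credential before any Spark write is attempted.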
   
   Fix https://github.com/apache/polaris/issues/144
   
   ## Type of change
   
   Please delete options that are not relevant.
   
   - [x] Bug fix (non-breaking change which fixes an issue)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing 
functionality to not work as expected)
   - [ ] This change requires a documentation update
   
   # How Has This Been Tested?
   
   This has been tested locally with the following steps:
   Unhappy path (current code):
   ```
   # create .env file for `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`
   # build and start containers
   docker compose -f docker-compose-jupyter.yml up
   # get auth info from logs (as default is in-memory)
   docker logs xxxx | grep xxxx
   # update the jupyter notebook with polaris's auth info, the AWS ARN, and the S3 path
   # run through the notebook
   # failed on create table due to no access to the s3 bucket
   ```
   
   Happy path (fixed code):
   ```
   # create .env file for `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`
   # build and start containers
   docker compose -f docker-compose-jupyter.yml up
   # get auth info from logs (as default is in-memory)
   docker logs xxxx | grep xxxx
   # update the jupyter notebook with polaris's auth info, the AWS ARN, and the S3 path
   # run through the notebook
   # all cells completed (a couple of cells are expected to fail due to access
   # on role/catalog, so they are considered completed)
   ```
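
The first step in both paths depends on the `.env` file defining both credentials. A small pre-flight check along these lines could catch an incomplete file before the containers start. It is a sketch only: `missing_env_keys` is a hypothetical helper, and it handles plain `KEY=value` lines, not quoting or `export` prefixes.

```python
def missing_env_keys(dotenv_text,
                     required=("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")):
    """Return the required keys that are absent or empty in .env-style text."""
    defined = set()
    for raw in dotenv_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value.strip():
            defined.add(key.strip())
    return [key for key in required if key not in defined]
```

For example, `missing_env_keys(open(".env").read())` returns an empty list when both keys are present with non-empty values.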
   
   Reasoning: The AWS ARN only identifies a role; we still need a user with 
credentials to assume that role. 
   
   # Checklist:
   
   Please delete options that are not relevant.
   
   - [x] I have performed a self-review of my code
   - [ ] I have commented my code, particularly in hard-to-understand areas
   - [ ] I have made corresponding changes to the documentation
   - [x] My changes generate no new warnings
   - [ ] I have added tests that prove my fix is effective or that my feature 
works
   - [x] New and existing unit tests pass locally with my changes
   - [ ] Any dependent changes have been merged and published in downstream 
modules
   - [ ] If adding new functionality, I have discussed my implementation with 
the community using the linked GitHub issue
   - [x] I have signed and submitted the 
[ICLA](https://github.com/polaris-catalog/polaris/blob/main/ICLA.md) and if 
needed, the 
[CCLA](https://github.com/polaris-catalog/polaris/blob/main/CCLA.md). See 
[Contributing](https://github.com/polaris-catalog/polaris/blob/main/CONTRIBUTING.md)
 for details. 
   

