shangxinli commented on pull request #808:
URL: https://github.com/apache/parquet-mr/pull/808#issuecomment-671465375


   
   > But you have other options in the same WriteSupport init function. One of 
them is to use the Hadoop configuration. It is not global anymore, but rather a 
local copy, an object created for the given worker. It still carries the global 
properties, and you can add your crypto properties here; they will be carried 
to the crypto factory.
   
   @ggershinsky, thanks for understanding! As discussed in the 3rd paragraph of 
my last reply, we do have the option to use 'Configuration', but we prefer to 
keep the column-level settings within the schema itself. The reasons are 
explained in the 1st paragraph. Although it is a local copy, the problems I 
mentioned are still there. I tried that solution before arriving at the current 
one. Using the 'extraMetadata' field is slightly better, but using the schema 
is still the better choice. 
   
   Since we have already defined the EncryptionPropertiesFactory interface, I 
think we can let users decide how they construct it and through which channel 
they get the needed information, because in reality there could be many other 
use cases that we don't know about at this point. They should choose the one 
that best fits their use case. I don't think we should, or can, limit this to 
just one channel. As you mentioned, they can use 'Configuration' and 
'extraMetadata' today, so there are already two. People can also use RPC calls 
to get the information, and so on. 
   
   Having a metadata field at the column level in the schema should, in my 
opinion, be supported in Parquet. As I have mentioned multiple times, the 
schemas of many other systems, such as Avro and Spark, already have one. If we 
have some column-wise configuration, that would be a good place to set it. I 
think Gabor already has a PR for column-wise configuration. 
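   
   To make the two channels concrete, here is a minimal, self-contained Java 
sketch. It is only an illustration under stated assumptions: 'Column', 
'KeyIdSource', and the property names 'encrypt.key' and 
'crypto.column.<name>.key' are hypothetical and are not Parquet or Hadoop APIs, 
and a plain Map stands in for 'Configuration'. It shows that the same lookup 
can be fed either from a flat property map or from metadata attached to each 
column. 

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnKeySketch {

    // Hypothetical stand-in for a schema column that carries its own metadata.
    // Parquet's real schema types are deliberately not used here.
    static final class Column {
        final String name;
        final Map<String, String> metadata;

        Column(String name, Map<String, String> metadata) {
            this.name = name;
            this.metadata = metadata;
        }
    }

    // Hypothetical channel abstraction: where a factory looks up a column's key id.
    interface KeyIdSource {
        String keyIdFor(Column column); // null means "leave this column unencrypted"
    }

    // Channel 1: a flat property map, standing in for a Hadoop 'Configuration'.
    // The property name pattern is an assumption made for this sketch.
    static KeyIdSource fromProperties(Map<String, String> props) {
        return column -> props.get("crypto.column." + column.name + ".key");
    }

    // Channel 2: metadata attached to the column in the schema itself.
    // The "encrypt.key" metadata name is an assumption made for this sketch.
    static KeyIdSource fromSchemaMetadata() {
        return column -> column.metadata.get("encrypt.key");
    }

    // Collects column name -> key id for every column the source marks as encrypted.
    static Map<String, String> resolve(List<Column> schema, KeyIdSource source) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Column c : schema) {
            String keyId = source.keyIdFor(c);
            if (keyId != null) {
                result.put(c.name, keyId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Column> schema = List.of(
            new Column("ssn", Map.of("encrypt.key", "k1")),
            new Column("city", Map.of()));
        Map<String, String> props = Map.of("crypto.column.ssn.key", "k1");

        System.out.println(resolve(schema, fromSchemaMetadata())); // prints {ssn=k1}
        System.out.println(resolve(schema, fromProperties(props))); // prints {ssn=k1}
    }
}
```

   With the schema channel the setting travels with the column definition; 
with the property channel the column name has to be repeated in a separate 
key, which then has to be kept in sync with the schema. 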
   
   I think we are clear on the use cases, the needs, and the changes 
themselves. Can you share your concern about adding the metadata field? I feel 
it should be a safe change (the earlier comments with Gabor discussed that). 
Please let me know what you think. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

