mapleFU commented on issue #15074:
URL: https://github.com/apache/arrow/issues/15074#issuecomment-1363873760

   > Can you point where we use `int16_t`?
   
   Keyword: `int16_t page_ordinal`
   
   `SerializedPageReader` and `SerializedPageWriter` have it.
   
   And encryption requires `page_ordinal` to be no more than the int16 maximum.
   In parquet-mr's implementation, `page_ordinal` is an `int` in memory, and the
   function checks whether it is greater than `Short.MAX_VALUE`:
   
   ```
   public static byte[] createModuleAAD(byte[] fileAAD, ModuleType moduleType,
         int rowGroupOrdinal, int columnOrdinal, int pageOrdinal) {
   ...
   
       if (pageOrdinal < 0) {
         throw new IllegalArgumentException("Wrong page ordinal: " + pageOrdinal);
       }
       short shortPageOrdinal = (short) pageOrdinal;
       if (shortPageOrdinal != pageOrdinal) {
         throw new ParquetCryptoRuntimeException("Encrypted parquet files can't have "
             + "more than " + Short.MAX_VALUE + " pages per chunk: " + pageOrdinal);
       }
       byte[] pageOrdinalBytes = shortToBytesLE(shortPageOrdinal);
   
       return concatByteArrays(fileAAD, typeOrdinalBytes, rowGroupOrdinalBytes,
           columnOrdinalBytes, pageOrdinalBytes);
   }
   ```
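   
   To make the narrowing check concrete, here is a minimal standalone sketch of the same idea. The class `PageOrdinalCheck` and the method `toShortOrdinal` are hypothetical names for illustration, not parquet-mr API, and `shortToBytesLE` is reimplemented here on the assumption that it emits the low byte first:
   
   ```java
   public class PageOrdinalCheck {
       // Hypothetical helper (not parquet-mr API): narrows an int page ordinal
       // to a short and rejects any value that does not survive the round trip.
       static short toShortOrdinal(int pageOrdinal) {
           if (pageOrdinal < 0) {
               throw new IllegalArgumentException("Wrong page ordinal: " + pageOrdinal);
           }
           short narrowed = (short) pageOrdinal;
           if (narrowed != pageOrdinal) {
               // 32768 narrows to -32768, so the comparison fails and we land here.
               throw new IllegalStateException(
                   "More than " + Short.MAX_VALUE + " pages per chunk: " + pageOrdinal);
           }
           return narrowed;
       }
   
       // Assumed little-endian layout: low byte first, then high byte.
       static byte[] shortToBytesLE(short value) {
           return new byte[] { (byte) (value & 0xFF), (byte) ((value >> 8) & 0xFF) };
       }
   
       public static void main(String[] args) {
           // The largest accepted ordinal is Short.MAX_VALUE.
           System.out.println(toShortOrdinal(Short.MAX_VALUE));
           try {
               toShortOrdinal(Short.MAX_VALUE + 1);
           } catch (IllegalStateException e) {
               System.out.println("rejected: " + e.getMessage());
           }
       }
   }
   ```
   
   Because the ordinal is serialized into two bytes of the AAD, any in-memory type wider than `int16_t` still has to be range-checked before encryption.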
   
   @pitrou 


