juliuszsompolski commented on issue #26340: [WIP][SPARK-29108][SQL] Add new module sql/thriftserver with all code and UT URL: https://github.com/apache/spark/pull/26340#issuecomment-548830411 > Add ThriftServerDelagationTokenManager and implement a method verifyToken since hive 1.2. don't have it > Remove unused Table Type class as I mentioned before. ClassicTypeMapping, HiveTableTypeMapping, TableTypeMapping and TableTypeMappingFactory > Add VariableSubstitution since it's different in hive 1.2.1 and hive 2.3.5 > Change package name These parts are needed to get it to work in an initial PR, but could we keep the others to followup PRs? I think each of them deserves a separate focused review: > return class StructType etc In this PR, the change to StructType propagates through all layers, all the way to ThriftCLIService, and only then it's changed into the thrift format by the SchemaMapper. It would be good to see where are all the changes propagated by that, and if maybe keeping TableSchema and just using SchemaMapper in more places to limit these changes makes sense. > Implement RowSet classes I am looking forward to this change, as let us skip one format conversion step in serialization, but I think it deserves some performance benchmarks to see what is the impact. For ColumnBasedSet it can for example have performance impact whether we do "foreach row, add to each column" or "foreach column, construct the Tcolumn from all rows". Also now both RowBasedSet and ColumnBasedSet internally keep SparkRows, and only at the end convert all rows to Thrift structures, while before it was converting on the fly as the rows were added. > integrate HiveServer2 and SparkThriftServer This could use a detailed review to see that there are no behaviour changes. > Some implement class convert to scala such as OerationManager, SessionManager etc.. I think this should be reviewed in two separate PRs * remove the Hive Operation classes that we override like ExecuteStatementOperation, GetTablesOperation etc... and make Spark operation inherit from Operation directly - when doing that it would be good to see and review what shared code can be moved to the base class (like e.g. all operations needing to execute with correct scheduler pool; exception handling etc.) * rewrite the manager classes like OperationManager, SessionManager - again, review what shared code can be moved to the base class... In both cases I think it also deserves a discussion whether the base class should be translated to scala or stay in Java.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
