rangadi commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1006192940
########## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala: ########## @@ -196,30 +194,52 @@ private[sql] object ProtobufUtils extends Logging { fileDescriptorSet = DescriptorProtos.FileDescriptorSet.parseFrom(dscFile) } catch { case ex: InvalidProtocolBufferException => - // TODO move all the exceptions to core/src/main/resources/error/error-classes.json - throw new RuntimeException("Error parsing descriptor byte[] into Descriptor object", ex) + throw QueryCompilationErrors.descrioptorParseError(ex) case ex: IOException => - throw new RuntimeException( - "Error reading Protobuf descriptor file at path: " + - descFilePath, - ex) + throw QueryCompilationErrors.cannotFindDescriptorFileError(descFilePath, ex) } - - val descriptorProto: DescriptorProtos.FileDescriptorProto = fileDescriptorSet.getFile(0) try { - val fileDescriptor: Descriptors.FileDescriptor = Descriptors.FileDescriptor.buildFrom( - descriptorProto, - new Array[Descriptors.FileDescriptor](0)) + val fileDescriptorProtoIndex = createDescriptorProtoMap(fileDescriptorSet) + val fileDescriptor: Descriptors.FileDescriptor = + buildFileDescriptor(fileDescriptorSet.getFileList.asScala.last, fileDescriptorProtoIndex) Review Comment: I think that is pretty restrictive. There will be cases where user wants to use the protobuf defined in imported files. E.g. they might have one proto file jus tot import bunch of other file that define the protobufs. What are we gaining by having restriction to allow messages only in the last listed file? In addition there could be be multiple top level files (i.e. `protoc` command is run with multiple files). I think fixing it properly would be the right thing. We can level a TODO here, I can add that in a follow up. LMK. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org