[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-03-01 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r171706547
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -408,6 +408,7 @@ message CommandProducerSuccess {
// The last sequence id that was stored by this producer in the 
previous session
// This will only be meaningful if deduplication has been enabled.
optional int64  last_sequence_id = 3 [default = -1];
+   optional bytes schema_version = 4;
 
 Review comment:
   @mgodave I think we need the `schema_version` in producer_success. That 
should be attached to messages when publishing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-03-01 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r171651451
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -22,6 +22,30 @@ package pulsar.proto;
 option java_package = "org.apache.pulsar.common.api.proto";
 option optimize_for = LITE_RUNTIME;
 
+message Schema {
+   enum Type {
+   Json = 1;
+   Protobuf = 2;
+   Thrift = 3;
+   Avro = 4;
+   Reserved1 = 5;
 
 Review comment:
   I don't think we should need these. When deserializing an unknown enum 
value, protobuf returns `null`. 
   
   Also, we typically we use peer protocol version, which is exchanged in the 
`Connect`/`Connected` phase. There we can check if the other side supports 
already a new feature or not.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-03-01 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r171623263
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -22,6 +22,13 @@ package pulsar.proto;
 option java_package = "org.apache.pulsar.common.api.proto";
 option optimize_for = LITE_RUNTIME;
 
+message Schema {
+required string name = 1;
+required bytes version = 2;
+required bytes schema_data = 3;
+repeated KeyValue properties = 4;
 
 Review comment:
   Though I think that, when we attach to a message, the `schema_version` field 
is anyway defined as `int64`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-28 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r171418965
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -197,6 +205,8 @@ message CommandSubscribe {
 repeated KeyValue metadata = 10;
 
 optional bool read_compacted = 11;
+
+   optional int64 schema_version = 12;
 
 Review comment:
   When I subscribe, should I just pass the schema itself? How do I know the 
version that my current code for `MyClass.class` has?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-28 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r171411681
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -387,6 +400,7 @@ message CommandProducerSuccess {
// The last sequence id that was stored by this producer in the 
previous session
// This will only be meaningful if deduplication has been enabled.
optional int64  last_sequence_id = 3 [default = -1];
+   optional Schema schema = 4;
 
 Review comment:
   @mgodave I think that the schema should be on the `CreateProducer` command, 
but I don't see how it would be used on the `ProducerSuccess`, since it will be 
not useful for the producer at that point.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-28 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r171418590
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -22,6 +22,13 @@ package pulsar.proto;
 option java_package = "org.apache.pulsar.common.api.proto";
 option optimize_for = LITE_RUNTIME;
 
+message Schema {
+required string name = 1;
+required bytes version = 2;
+required bytes schema_data = 3;
+repeated KeyValue properties = 4;
 
 Review comment:
   I think it was already pointed out, though it's missing now. 
   
1. Shouldn't `version` be a `long` type?
2. Since we are not supporting arbitrary schemas, I think we should 
identify the type of the schema that the client is sending. We should have an 
enum of choices, but use `int` type on the wire to prepare for when adding new 
enum values.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-14 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r168293225
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -368,6 +387,7 @@ message CommandRedeliverUnacknowledgedMessages {
 
 message CommandSuccess {
required uint64 request_id = 1;
+   optional Schema schema = 2;
 
 Review comment:
   I think it would make sense to have a schema at the namespace level (with 
per-topic override), since we already have a lot of administration tasks at 
that level.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-14 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r168274945
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -79,8 +86,9 @@ message MessageMetadata {
repeated EncryptionKeys encryption_keys = 13;
// Algorithm used to encrypt data key
optional string encryption_algo = 14;
+   optional bytes schema_version = 15;
// Additional parameters required by encryption
-   optional bytes encryption_param = 15;
+   optional bytes encryption_param = 16;
 
 Review comment:
   We cannot change the integer id of an existing field


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-14 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r168274469
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -22,6 +22,20 @@ package pulsar.proto;
 option java_package = "org.apache.pulsar.common.api.proto";
 option optimize_for = LITE_RUNTIME;
 
+message Schema {
+   required string name = 1;
+   required bytes version = 2;
+   required bytes schema_data = 3;
 
 Review comment:
   I agree that storing the schema data is agnostic to the format, however 
these protobufs are relative to the interactions between client and broker. 
   
   In this case, the producer might use something like 
`Schema.avroOf(MyClass.class)` or `Schema.jsonOf(MyClass.class)`. In this case, 
 don't we need to communicate to the server what kind of schema the client is 
using?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-14 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r168236331
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -272,6 +289,8 @@ message CommandProducer {
 
 /// Add optional metadata key=value to this producer
 repeated KeyValue metadata= 6;
+
+   optional int64 schema_version = 7;
 
 Review comment:
   It should carry the whole schema here. The application/producer doesn't 
necessarely know what the version is. 
   
   Eg: if I'm using a type like : 
   
   ```java
   class MyClass {
  public int a;
  public int b;
   };
   ```
   
   When I try to publish, I just know the format I'm about to send, and we can 
extract the avro/json schema out of it. When I try to create producer, I can 
send that information, and the broker can compare with the schemas already 
availalable for this topic. If the schema is know, we just return the schema 
version in the `ProducerSuccess`  response. If it's a new (compatible) version 
of the schema, it will also be stored.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-14 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r168231034
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -272,6 +289,8 @@ message CommandProducer {
 
 /// Add optional metadata key=value to this producer
 repeated KeyValue metadata= 6;
+
+   optional int64 schema_version = 7;
 
 Review comment:
   Sure, no problem. I think it would be good to have the "final" version of 
the `PulsarApi.proto` to get a sense of all the changes in the interaction 
between client & broker.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-13 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r167959766
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -22,6 +22,20 @@ package pulsar.proto;
 option java_package = "org.apache.pulsar.common.api.proto";
 option optimize_for = LITE_RUNTIME;
 
+message Schema {
+   required string name = 1;
+   required bytes version = 2;
+   required bytes schema_data = 7;
+repeated KeyValue properties = 8;
+}
 
 Review comment:
   How are we identifying the schema type? (eg: Json vs Avro vs..)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] merlimat commented on a change in pull request #1232: Schema registry (1/4)

2018-02-13 Thread GitBox
merlimat commented on a change in pull request #1232: Schema registry (1/4)
URL: https://github.com/apache/incubator-pulsar/pull/1232#discussion_r167959515
 
 

 ##
 File path: pulsar-common/src/main/proto/PulsarApi.proto
 ##
 @@ -272,6 +289,8 @@ message CommandProducer {
 
 /// Add optional metadata key=value to this producer
 repeated KeyValue metadata= 6;
+
+   optional int64 schema_version = 7;
 
 Review comment:
   I think that when creating producer & consumer we need to specify the actual 
schema so that we can validate on broker side, and if that fails, the producer 
creation fails.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services