[ 
https://issues.apache.org/jira/browse/HCATALOG-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493694#comment-13493694
 ] 

Mithun Radhakrishnan commented on HCATALOG-546:
-----------------------------------------------

Hello, Travis. Thanks for the review. (I've updated the patch.)

1. Again, thanks for doing the measurements. The concern isn't so much that 
they're too large now as that they could be smaller. Given the volume of 
events we expect to be consuming in Oozie, we're expecting overall gains 
from removing anything that's redundant.
2. A couple of thoughts about thrift:
  a. We'd like to use/perpetuate as little of the thrift-struct definitions in 
our interfaces as viable. At some point, I expect that the thrift bits will be 
replaced. (webhcat comes to mind.) 
  b. We suspect that, as far as language bindings go, consuming JSON message 
strings would be easier than using thrift.
  c. We'd like to deliberately decouple notification-content from the contents 
of the thrift-structs, not just because a lot of it is redundant/queryable, but 
also because we'd then have the liberty to introduce new content to an 
AlterPartitionEvent that might not be held in the thrift Partition.
(You're right, though. The Partition might still deserialize correctly even if 
the struct changes. I thought it wouldn't, initially.)
  d. One option could have been to serialize the whole notification message in 
thrift. JSON was just simpler.

I thought I'd mention that after this has stabilized, the next step would be to 
introduce support for logical (and atomic) "sets" of partitions. Serializing 
just the partition-key-vals instead of whole Partition instances would yield 
savings, especially if the sets tend to be large. And if the messages are 
going to be persisted (say, for querying later), any space-savings will 
help.
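
To illustrate the space argument, here is a rough Python sketch that compares a 
message carrying only partition key-values with one that repeats fuller 
per-partition details. The field names (eventType, db, table, partitions) and 
the extra fields in the "full" record are hypothetical, made up for this 
comparison; they are not taken from the patch or from the thrift Partition 
definition.

```python
import json


def key_val_message(db, table, partitions):
    """Build a notification carrying only partition key-values (hypothetical layout)."""
    return json.dumps({
        "eventType": "ADD_PARTITION",  # hypothetical field name
        "db": db,
        "table": table,
        "partitions": partitions,      # list of {partition-key: value} dicts
    }, sort_keys=True)


def full_message(db, table, partitions):
    """Build a notification that repeats queryable details for every partition.

    The extra fields below are invented stand-ins for what a whole serialized
    Partition might carry; they exist only to make the size comparison concrete.
    """
    full = []
    for p in partitions:
        spec = "/".join("%s=%s" % (k, v) for k, v in sorted(p.items()))
        full.append({
            "values": p,
            "dbName": db,
            "tableName": table,
            "location": "/warehouse/%s/%s/%s" % (db, table, spec),
            "inputFormat": "org.example.HypotheticalInputFormat",
            "parameters": {"createdBy": "example"},
        })
    return json.dumps({
        "eventType": "ADD_PARTITION",
        "db": db,
        "table": table,
        "partitions": full,
    }, sort_keys=True)


# A set of 24 hourly partitions for one day.
partitions = [{"ds": "2012-11-08", "hour": "%02d" % h} for h in range(24)]
compact = key_val_message("default", "clicks", partitions)
verbose = full_message("default", "clicks", partitions)
print(len(compact), len(verbose))
```

The gap between the two lengths grows with the number of partitions in the set, 
since the per-partition overhead is paid once per entry.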

Does that sound alright?
                
> Rework HCatalog's JMS Notifications 
> ------------------------------------
>
>                 Key: HCATALOG-546
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-546
>             Project: HCatalog
>          Issue Type: Bug
>          Components: notification
>    Affects Versions: 0.4.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>             Fix For: 0.4.1
>
>         Attachments: HCATALOG-546.patch, sample.Add.Drop.Database.json, 
> sample.Add.Drop.Partition.json, sample.Add.Drop.Table.json
>
>
> In 0.4.1, the NotificationListener listens for metastore operations and emits 
> JMS notifications containing the entire metastore-objects 
> (Database/Table/Partitions) in Java-serialized form. The assumption at the 
> time was that consumers might need access to the whole object. This policy 
> poses a couple of problems:
> 1. The notifications are verbose, since they convey a bunch of information 
> that's available from querying the metastore anyway.
> 2. Consumers of these JMS notifications (e.g. Oozie) would now be dependent 
> on the Java class definitions of metastore-objects. If they change, Oozie 
> would also need to be restarted (with updated libs), to consume the 
> notifications.
> Ideally, the notifications should convey only the minimum information that 
> identifies the metastore-change unambiguously. (Everything else can be 
> queried for.) They should be backward compatible. If new fields are added, 
> existing consumers shouldn't break (unless they intend to consume the new 
> fields). Also, the notification-format ought to be pluggable.
> For the initial rework, we're proposing to use a JSON-string to represent the 
> notification-content. We're also proposing a helper-class for the likes of 
> Oozie to use, that converts the strings to POJOs, in a backward-compatible 
> fashion.
> I'll attach sample notifications and a tentative patch shortly.
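
The backward-compatibility goal in the description (new fields must not break 
existing consumers) can be sketched as follows. This is a minimal Python 
illustration of the idea only, not the helper-class from the patch: the class 
name, the field names, and the "addedLater" field are all hypothetical. The 
parser keeps the fields it knows about, drops the rest, and falls back to 
defaults for optional fields that an older producer did not send.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class AddPartitionMessage:
    # Hypothetical fields; the real message layout is defined by the patch.
    db: str
    table: str
    partitions: list
    server: str = ""  # optional: the default keeps older messages parseable


def parse_message(text):
    """Convert a JSON notification string into a typed object.

    Unknown fields are ignored, so a producer can add content without
    breaking consumers that were built against an older format.
    """
    raw = json.loads(text)
    known = {f.name for f in fields(AddPartitionMessage)}
    return AddPartitionMessage(**{k: v for k, v in raw.items() if k in known})


# A message from a newer producer, carrying a field this consumer predates.
newer = ('{"db": "default", "table": "clicks", '
         '"partitions": [{"ds": "2012-11-08"}], "addedLater": 42}')
msg = parse_message(newer)
print(msg.table, msg.partitions)
```

A consumer that does want the new field would add it to its own class with a 
default value, which is the "unless they intend to consume the new fields" case.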

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
