On Thu, Nov 19, 2009 at 9:18 PM, Peter Saint-Andre <[email protected]> wrote:
> On 11/18/09 10:48 PM, Ville Varis wrote:
>
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting so bad?
> A: Top-posting.
> Q: What is the most annoying behavior on email discussion lists?
>
> ;-)
>
> > Ok, that is really the answer for using 'ver' instead of timestamps, true
> > and thanks. I should have seen that; I was thinking too much from an
> > algorithmic point of view and forgetting some realities :)
> >
> > As for the rest of my post, I still think that 'ver' should be able
> > to be set by the publisher, and that this feature must be optional. There
> > are reasons for the service to generate 'ver', and that is surely good for
> > many cases.
> >
> > My main point is that there are also reasons and use cases for allowing
> > the publisher to set 'ver'.
>
> What are the use cases for allowing the publisher to set the 'ver'?
> Maybe we need some other tracking device (perhaps a per-item counter so
> that you know this is the 3rd update to a particular item or whatever)
> but I don't see a need to allow the publisher to specify the 'ver' (just
> as we wouldn't allow an IM client to set the 'ver' for the roster).
>
> /psa
>
Simple use case: We have two servers connected to an exchange back-end
system, each creating a similar feed. We process these feeds and then publish
the results through pub/sub nodes.
Let's call these nodes Feed1 and Feed2.
We have just 1 item (really, not a very good exchange we have there).
ItemID = 1 in both feeds (set by the feed publisher).
Now, the item changes every now and then, probably even several times
within milliseconds. To make the conflict happen, we send each update to
subscribers (in real life other things cause the delay in message passing).
Now in Feed1 we get an update:
Item=1, ver=F1Generated123 (PublisherVer1)
In Feed2 we get an update:
Item=1, ver=F2Generated123 (PublisherVer1)
The subscriber receives the feed from both sources (for reliability).
All goes fine, as the subscriber just gets the same data in each update.
But,
Now in Feed1 we get two updates:
Item=1, ver=F1Generated456 (PublisherVer2)
Item=1, ver=F1Generated789 (PublisherVer3)
Now in Feed2 we get two updates:
Item=1, ver=F2Generated456 (PublisherVer2)
Item=1, ver=F2Generated789 (PublisherVer3)
The subscriber first receives both messages from Feed1, and everything goes
ok. Then the subscriber receives the messages from Feed2 and must update its
data _back_ to PublisherVer2, as it has no way of knowing whether the item
with 'ver'=F1Generated789 or the one with 'ver'=F2Generated456 is the
currently valid one. With the publisher-set version, the subscriber can notice
that PublisherVer2 is older than what it already has and just omit that
update from Feed2.
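To make the above concrete, here is a rough sketch of the subscriber-side
check. It is only an illustration: the names (Update, Subscriber,
publisher_ver) are made up for this example, and it assumes the publisher-set
version is an integer that increases with each change and is identical across
feeds.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Update:
    feed: str           # pub/sub node that delivered the update
    item_id: str
    service_ver: str    # opaque 'ver' generated per feed by the service
    publisher_ver: int  # assumption: publisher-set, comparable across feeds
    payload: str


@dataclass
class Subscriber:
    # item_id -> (publisher_ver, payload) currently held by the subscriber
    state: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def apply(self, upd: Update) -> bool:
        """Apply the update unless we already hold the same or a newer version."""
        current = self.state.get(upd.item_id)
        if current is not None and upd.publisher_ver <= current[0]:
            return False  # stale or duplicate: omit it
        self.state[upd.item_id] = (upd.publisher_ver, upd.payload)
        return True
```

With only service_ver available, the two late updates from Feed2 could not be
told apart from new data; with publisher_ver they are simply dropped.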
And really, this is just the start for reliability and recovery.
Exactly the same would happen if Feed3 recovered itself from a file 1 hour
later: it would generate absolutely different timestamps and 'ver' values
than Feeds 1 and 2 did live earlier, even though the publisher would know the
exact version of the item.
Moreover, when hot-swapping from Feed1 to Feed2, the subscriber can _trust_
that Feed2 is exactly the same as Feed1, and it would be enough to
request only the updates since the version the subscriber knows. (For smart
recovery, exactly the (ItemId, Ver) pairs the subscriber knows need to be sent
to the server, and the changes relative to those sent back to the client.)
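The smart recovery idea could look roughly like this on the server side. This
is a sketch under assumptions, not a protocol proposal: changes_since is a
hypothetical helper, and the versions are the publisher-set ones, so they mean
the same thing on every feed and an equality check is enough.

```python
from typing import Dict, Tuple


def changes_since(
    server_items: Dict[str, Tuple[int, str]],
    known: Dict[str, int],
) -> Dict[str, Tuple[int, str]]:
    """Return the items whose version differs from what the subscriber knows.

    server_items maps item_id -> (ver, payload) as held by the feed;
    known maps item_id -> ver as reported by the subscriber.
    Items the subscriber has never seen are returned as well.
    """
    return {
        item_id: (ver, payload)
        for item_id, (ver, payload) in server_items.items()
        if known.get(item_id) != ver
    }
```

After a hot-swap the subscriber would send its known pairs to the new feed and
get back only the items that changed in the meantime.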
Not finished yet.
By allowing the publisher to set 'ver', we can set up, let's say, EuroFeed1
and EuroFeed2, and UsaFeed1.
UsaFeed1 cannot connect to the original source due to some restrictions or
whatever.
UsaFeed1 can now subscribe as a standard subscriber (for reliability) to
EuroFeed1 and EuroFeed2, and build its own publishes to UsaFeed1 from either
EuroFeed1 or EuroFeed2. As long as either EuroFeed1 or EuroFeed2 stays up,
there is a feed for Usa. (You can scale it up by changing UsaFeed1 to
FrontFeed01..FrontFeed10.)
And in the end you have N _identical_ feeds available, which may act as
backups for each other, failover points, hot-swaps, hot-recovery,
whatever you call it.
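A relay such as UsaFeed1 could be sketched along these lines. Again, this is
only an illustration: Relay and the publish callback are hypothetical names,
and the version is assumed to be an integer set by the original publisher. The
point is that the relay passes the publisher's version through unchanged
instead of generating its own.

```python
from typing import Callable, Dict


class Relay:
    """Merges several upstream feeds into one downstream feed."""

    def __init__(self, publish: Callable[[str, int, str], None]) -> None:
        self._publish = publish            # hypothetical downstream publish hook
        self._latest: Dict[str, int] = {}  # item_id -> newest version forwarded

    def on_upstream(self, item_id: str, publisher_ver: int, payload: str) -> None:
        """Forward an upstream update once, keeping the publisher's version."""
        latest = self._latest.get(item_id)
        if latest is not None and publisher_ver <= latest:
            return  # already forwarded from the other upstream feed
        self._latest[item_id] = publisher_ver
        self._publish(item_id, publisher_ver, payload)
```

Because the downstream 'ver' equals the upstream one, a subscriber of UsaFeed1
holds the same (ItemId, Ver) pairs as a subscriber of EuroFeed1 or EuroFeed2.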
A per-item update counter does not work, as you may, for example, have a
back-end DB storing the data, from which you can generate the correct
HistoryId ('ver', please) for each item on its first insert. You may also not
be willing to send each and every update from the original source, in which
case the number of updates to a specific item may vary between feeds while
the result for a unique (ItemId, 'ver') pair stays the same.
In the end, I see the IM client and pubsub use cases as totally different
from each other, and so the rules for them should be different too.
I think I'm done,
Ville Varis
P.S. This kind of 'ver' (HistoryId) became a necessity for us after 5 years
of using only ItemId and a publisher-service generated timestamp. We have used
this and derivative algorithms successfully for several years. I really hope I
can make my thoughts clear at some point, if not for 1.13, then probably for
the future.