[ https://issues.apache.org/jira/browse/SOLR-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307584#comment-16307584 ]
Gus Heck commented on SOLR-11653: --------------------------------- Some thoughts: It seems that you've added ROUTEDALIAS_CREATECOLL as a user accessible command (the v1 API will respond to it I think) but not reflected in v2 api json files. I think this is because this command is probably not meant for API invocation in the first place, so it kinda looks out of place as an undocumented API. I kind of wonder why it was done as an admin command. I'm probably missing something, but it seems like this means we have: # doc arrives for which it is appropriate to create a new collection. # issue admin command ROUTEDALIAS_CREATECOLL and wait for it # inside ROUTEDALIAS_CREATECOLL issue CollectionsHandler.CollectionOperation.CREATE_OP.execute(... and wait for that # finally process the update I'm not sure why we want the extra layer? is there actually a use case for manual creation of the next partition? To me it seems as if this operation is internal to TimeRoutedAliasUpdateProcessor and should be there. I can possibly imagine that if we were proactively keeping ahead of things by one collection this structure could allow fast processing of the update by giving it an async id, but it looks like the code is set up to add a collection the first time it gets a document that requires that collection (up to maxFutureMs), and delay the update until that succeeds. This makes me wonder if we should even be supporting +1SECOND? If this command is not sub-second we fall behind... That probably also is a really bad idea for sheer number of collections too. As a side benefit we could also shorten our collection names too... I think your loop for creating collections should be dependent on the value of maxFutureMs? Let's say now() is 01:22 the head is presently ending in _01_00 and accepting 01:00 hrs to 02:00 hrs, maxFutureMs=72*60*60*1000 and we get a document for 60 hours in the future... looks like we loop up to 5 times and successfully create the _02_00, 0_03_00, _04_00,_05_00,_06_00 partitions and then error out unless we get an update to our parsedCollectionAliases in the mean time... if any 5 creations are faster than the updates to parsedCollectionAliases we wrongly error out. The error message will be misleading too since the attempts to create the collections all succeeded but we quit due to some sort of communications lag. IF the parsedCollectionsAliases gets updated in time we will reset our counter and keep going, but why not directly calculate the appropriate minimum number of attempts from DocTimestampMs - HeadTimestampMs divided by HeadTimestampMs+(one interval) - HeadTimestampMs? A constant can be added if we want some slack for retrying failed calls. Perhaps my work in SOLR-11722 should be filling out all prospective next collections up to maxFutureMs (which I need to add to metadata)? Then in this code rather than testing the current document's time stamp to see if we create a new collection, test whether or not the next increment falls within MaxFutureMS. In the case that it's time to create a new collection we can then check the document value and create the next one asynchronously unless the current document would fall in the yet to be created collection (which should normally be a rare case). That would go from one guaranteed slow update every hour (for +1HOUR routed aliases) to only having a slow update if a document at the very beginning of an hour happens to fall *just* inside maxFutureMs. We would need to track that the collection was in progress for creation of course to avoid spamming the overseer each hour... It seems easier and more robust to never need to create more than one collection during an update. In that case your present loop is just fine. In code, you have variable names and comments talking about "head" but I had to dig a little to confirm that this was actually the "most recent" not "the first".... be nice if comments made this clear. I agree with your desire to rename handleResponse() ... confused me at first. > create next time collection based on a fixed time gap > ----------------------------------------------------- > > Key: SOLR-11653 > URL: https://issues.apache.org/jira/browse/SOLR-11653 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud > Reporter: David Smiley > Assignee: David Smiley > Attachments: SOLR-11653.patch, SOLR-11653.patch > > > For time series collections (as part of a collection Alias with certain > metadata), we want to automatically add new collections. In this issue, this > is about creating the next collection based on a configurable fixed time gap. > And we will also add this collection synchronously once a document flowing > through the URP chain exceeds the gap, as opposed to asynchronously in > advance. There will be some Alias metadata to define in this issue. The > preponderance of the implementation will be in TimePartitionedUpdateProcessor > or perhaps a helper to this URP. > note: other issues will implement pre-emptive creation and capping > collections by size. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org