Re: new Metadata format for download url in LLCRealtimeSegmentZKMetadata

TING CHEN Sat, 13 Jun 2020 00:15:06 -0700

My understanding about Jackie's proposal is that in the step (3) below,
segment downloader just fetches from the segment download url no matter
what the value is. As long as the download fails (an empty uri certainly
triggers failure), the peer download path is used as the final fallback (if
peer download optional is enabled). I would say the two methods are similar
though. Adding an empty URI check will slightly improve the performance
while increasing the complexity of the logic (again only marginally).


I am also a bit concerned about how the readers of download urls will
behave with an empty URI. Let me do a code search to find out.


On Fri, Jun 12, 2020 at 10:03 AM Subbu Subramaniam <[email protected]>
wrote:

> OK, I think I see jackie's point. An empty URI simply means that we could
> not get it to store in deep store reliably. Architecturally, that makes
> sense. If a SegmentFetcher wants to fetch the segment and finds empty URI,
> then it needs to look at servers with ExrernalView being ONLINE for the
> segment and fetch it from there.
>
> So, let us do this.
> (1) The segment completion protoocol carries a URI as proposed by Ting.
> (We should not be able to distinguish between an invalid request that may
> be missing a URI vs a request that truly says that the URI is only with
> peers at this point in time).
> (2) Controller commits the metadata with an empty URI if it sees a "peer"
> scheme in the committing segment URI
> (3) Segment downloader assumes that an empty URI == download from peer.
>
> The con side of this is that we need to check everywhere to make sure that
> we are handling null URI in the metadata. It is hard to catch this in
> review, so i would ask Ting to do the due diligence here, and also have a
> couple of reviewers just in case.
>
> Regarding segment download, I have a proposal:
> A. Update the SegmentFetcher interface to introduce a new method that
> takes a list of URIs instead of one URI. Default impl for this method can
> be to call the older API with any of the URIs
> B. Modify the http segment fetcher to try a random URL if a list is given
> (or, cycle through, or whatever).
>
> This makes Ting;s code simpler on the caller side
>
> -Subbu
>
> On 2020/06/12 05:26:41, TING CHEN <[email protected]> wrote:
> > Thanks for your comments. The download url zk Metadata format change is
> > part of the deepstore by-pass proposal. The design aims to improve on
> > status quo for two cases w.r.t deep store config:
> >
> >    1. A deep store configured for Pinot cluster
> >    2. No deep store configured for Pinot -- not for heavy users of Pinot
> >    but still valuable for many small/medium size setup as seen from slack
> >    questions.
> >
> > For case 2, because there is no deep store URI, there is a need for a new
> > download URI format in zkMetaData.
> > For case 1, our design proposal of controller and server interaction in
> > case of deep store outage is similar to your proposal. In particular,
> >
> >    1. When the selected commit server fails to upload a segment *S* to
> the
> >    deep store, it will pass a special uri *U* to the controller.
> >    2. The controller will then save *U *in the down url of a segment
> >    metadata.
> >    3. When another server needs to download *S, *if it sees *U *it
> >    directly downloads from peer servers. Otherwise it first tries to
> download
> >    from deep store and if fails trying to download from peer servers.
> >
> > Note that the above steps work for both case 1 and case 2.
> >
> > To me, your proposal simplifies the logic above in step 3 assuming deep
> > store is configured. I.e., everytime the server needs to download *S, *it
> > always downloads from deepstore first if it fails, downloads *S* from
> peer.
> > My proposal adds a check from the uri format but overall it works with
> both
> > cases regardless if the deep store is configured or not.
> >
> > As for segment download cost from peer servers, I agree it is expensive
> and
> > should be avoided for high traffic clusters. But it is not the main focus
> > of this discussion.
> >
> >
> > On Thu, Jun 11, 2020 at 4:18 PM Xiaotian Jiang <[email protected]>
> wrote:
> >
> > > Let me elaborate more on my proposal:
> > >
> > > The problem we are trying to solve here is when deep storage has lower
> > > availability than Pinot, we should be able to leverage the segment
> > > replications on Pinot server to increase the availability of segment
> > > download.
> > > Here we should first try to download from deep storage, and only if it
> > > fails, we use peer download as the backup.
> > > It is possible that the deep storage URL does not exist because the
> > > segment upload failed, in which case we should download from the
> > > peers.
> > > Notice that in both cases, peer download should be modeled as a backup
> > > plan instead of the main way of downloading segments.
> > > Also, downloading segments from a server requires the server to
> > > compress and send the segments, which is not a cheap operation, and
> > > can cause performance impact on the server. So we should only use peer
> > > download if there is no other option, i.e. as a backup.
> > > If we model peer download as a backup plan, we should not overload the
> > > existing downloadUri to trigger it. Instead, we should try to download
> > > with the downloadUri first, and only if it fails (including the case
> > > where downloadUri does not exist), we try to download from peer.
> > >
> > > On Thu, Jun 11, 2020 at 3:26 PM TING CHEN <[email protected]>
> > > wrote:
> > > >
> > > > Our current design
> > > > <
> > >
> https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion
> > > >
> > > > does not add a new pinotFS for *peer://*. This is a very interesting
> > > > question though. The only operations our design uses today for peer
> "FS"
> > > is
> > > > essentially *copyToLocal()* in the pinotFS interface. Our design
> > > basically
> > > > has a class and a few supporting methods
> > > > <
> > >
> https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion#BypassingdeepstorerequirementforRealtimesegmentcompletion-EnablebesteffortsegmentuploadinSplitSegmentCommiteranddownloadsegmentfrompeerservers
> > > .>
> > > > implementing the above method. That is why we do not add a
> full-fledged
> > > > pinotFS subclass for this design.
> > > >
> > > >
> > > > On Thu, Jun 11, 2020 at 3:00 PM kishore g <[email protected]>
> wrote:
> > > >
> > > > > Also, will peer:// have an implementation of pinotFS ?
> > > > >
> > > > > On Thu, Jun 11, 2020 at 2:58 PM kishore g <[email protected]>
> wrote:
> > > > >
> > > > > > +1 peer.
> > > > > >
> > > > > > unrelated to this - do we support multiple URI's?
> > > > > >
> > > > > > On Thu, Jun 11, 2020 at 2:51 PM Subbu Subramaniam <
> > > [email protected]>
> > > > > > wrote:
> > > > > >
> > > > > >> Hey Ting,
> > > > > >>
> > > > > >> I like the URI in metadata as "peer:///segmentName". This way,
> the
> > > URI
> > > > > >> remains parsable and we can use the scheme to check for a
> segment
> > > > > fetcher,
> > > > > >>
> > > > > >> thanks
> > > > > >>
> > > > > >> -Subbu
> > > > > >>
> > > > > >> On 2020/06/10 01:09:25, TING CHEN <[email protected]>
> > > wrote:
> > > > > >> > As part of the deep store by-passing
> > > > > >> > <
> > > > > >>
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion
> > > > > >> >
> > > > > >> > project, a server is allowed to download segments from peer
> Pinot
> > > > > >> servers
> > > > > >> > during Low Level Consumer (LLC) instead of deep segment
> stores. To
> > > > > >> enable
> > > > > >> > this feature, we plan to add a new URI format for the field
> > > > > >> > *segment.realtime.download.url
> > > > > >> > *in LLCRealtimeSegmentZKMetadata.
> > > > > >> >
> > > > > >> > The new URI format serves the purpose of instructing a Pinot
> > > server to
> > > > > >> find
> > > > > >> > and download the segment from a peer server. Controller
> writes it
> > > to
> > > > > >> Helix
> > > > > >> > in case of segment upload failure or no deep store configured
> at
> > > all.
> > > > > We
> > > > > >> > proposed the following format options and want to hear your
> > > feedback:
> > > > > >> >
> > > > > >> >    1. peer:///segmentName; (my preference)
> > > > > >> >    2. simply an empty string *''*
> > > > > >> >
> > > > > >> > Both are in essence specially markers to indicate that the
> > > segment is
> > > > > >> not
> > > > > >> > found in deep store and servers have to download them from
> peer
> > > > > servers.
> > > > > >> > (1) has the benefit of better readability than (2) for
> debugging
> > > > > >> purposes.
> > > > > >> >
> > > > > >> > Please let me know what you think.
> > > > > >> >
> > > > > >> > Ting Chen
> > > > > >> >
> > > > > >>
> > > > > >>
> > > ---------------------------------------------------------------------
> > > > > >> To unsubscribe, e-mail: [email protected]
> > > > > >> For additional commands, e-mail: [email protected]
> > > > > >>
> > > > > >>
> > > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> > >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

Re: new Metadata format for download url in LLCRealtimeSegmentZKMetadata

Reply via email to