> (Correct me if I'm wrong Matt, but as I recall, UCX addresses aren't hostnames but rather opaque byte blobs, for instance.)
You can use a hostname and port to create a ucx connection, but there is
separately an address object. A UCX address object is an opaque byte blob
that includes a whole mess of available transport and config information.
Using this instead of the host/port string allows skipping that config
exchange during connection basically.

I'd be okay with the + convention as an idea.

On Mon, Feb 12, 2024, 5:40 PM David Li <lidav...@apache.org> wrote:

> The idea is that the client would reuse the existing connection, in which
> case the protocol and such are implicit. (If the client doesn't have a
> connection anymore, it can't use the fallback anyways.)
>
> I suppose this has the advantage that you could "fall back" to a known
> hostname with a different protocol, but I'm not sure that always applies
> anyways. (Correct me if I'm wrong Matt, but as I recall, UCX addresses
> aren't hostnames but rather opaque byte blobs, for instance.)
>
> If we do prefer this, to avoid overloading the hostname, there's also the
> informal convention of using + in the scheme, so it could be
> arrow-flight-fallback+grpc+tls://, arrow-flight-fallback+http://, etc.
>
> On Mon, Feb 12, 2024, at 17:03, Joel Lubinitsky wrote:
> > Thanks for clarifying.
> >
> > Given the relationship between these two proposals, would it also be
> > necessary to distinguish the scheme (or schemes) supported by the
> > originating Flight RPC service?
> >
> > If that is the case, it may be preferred to use the "host" portion of the
> > URI rather than the "scheme" to denote the location of the data. In this
> > scenario, the host "0.0.0.0" could be used. This IP address is defined in
> > IETF RFC1122 [1] as "This host on this network", which seems most
> > consistent with the intended use-case. There are some caveats to this usage
> > but in my experience it's not uncommon for protocols to extend the
> > definition of this address in their own usage.
> >
> > A benefit of this convention is that the scheme remains available in the
> > URI to specify the transport available. For example, the following list of
> > locations may be included in the response:
> >
> > ["grpc://0.0.0.0", "ucx://0.0.0.0", "grpc://1.2.3.4", <other_locations>...]
> >
> > This would indicate that grpc and ucx transport is available from the
> > current service, grpc is available at 1.2.3.4, and possibly more
> > combinations of scheme/host.
> >
> > [1] https://datatracker.ietf.org/doc/html/rfc1122#section-3.2.1.3
> >
> > On Mon, Feb 12, 2024 at 2:53 PM David Li <lidav...@apache.org> wrote:
> >
> >> Ah, while I was thinking of it as useful for a fallback, I'm not
> >> specifying it that way. Better ideas for names would be appreciated.
> >>
> >> The actual precedence has never been specified. All endpoints are
> >> equivalent, so clients may use what is "best". For instance, with Matt
> >> Topol's concurrent proposal, a GPU-enabled client may preferentially try
> >> UCX endpoints while other clients may choose to ignore them entirely (e.g.
> >> because they don't have UCX installed).
> >>
> >> In practice the ADBC/JDBC drivers just scan the list left to right and try
> >> each endpoint in turn for lack of a better heuristic.
> >>
> >> On Mon, Feb 12, 2024, at 14:28, Joel Lubinitsky wrote:
> >> > Thanks for proposing this David.
> >> >
> >> > I think the ability to include the Flight RPC service itself in the list of
> >> > endpoints from which data can be fetched is a helpful addition.
> >> >
> >> > The current choice of name for the URI (arrow-flight-fallback://) seems to
> >> > imply that there is an order of precedence that should be considered in the
> >> > list of URI’s. Specifically, as a developer receiving the list of locations
> >> > I might assume that I should try fetching from other locations first. If
> >> > those do not succeed, I may try the original service as a fallback.
> >> >
> >> > Are these the intended semantics? If so, is there a way to include the
> >> > original service in the list of locations without the implied precedence?
> >> >
> >> > Thanks,
> >> > Joel
> >> >
> >> > On Mon, Feb 12, 2024 at 11:52 James Duong <james.du...@improving.com.invalid>
> >> > wrote:
> >> >
> >> >> This seems like a good idea, and also improves consistency with clients
> >> >> that erroneously assumed that the service endpoint was always in the list
> >> >> of endpoints.
> >> >>
> >> >> From: Antoine Pitrou <anto...@python.org>
> >> >> Date: Monday, February 12, 2024 at 6:05 AM
> >> >> To: dev@arrow.apache.org <dev@arrow.apache.org>
> >> >> Subject: Re: [DISCUSS] Flight RPC: add 'fallback' URI scheme
> >> >>
> >> >> Hello,
> >> >>
> >> >> This looks fine to me.
> >> >>
> >> >> Regards
> >> >>
> >> >> Antoine.
> >> >>
> >> >>
> >> >> On 12/02/2024 at 14:46, David Li wrote:
> >> >> > Hello,
> >> >> >
> >> >> > I'd like to propose a slight update to Flight RPC to make Flight SQL
> >> >> > work better in different deployment scenarios. Comments on the doc would
> >> >> > be appreciated:
> >> >> >
> >> >> > https://docs.google.com/document/d/1g9M9FmsZhkewlT1mLibuceQO8ugI0-fqumVAXKFjVGg/edit?usp=sharing
> >> >> >
> >> >> > The gist is that FlightEndpoint allows specifying either (1) a list of
> >> >> > concrete URIs to fetch data from or (2) no URIs, meaning to fetch from the
> >> >> > Flight RPC service itself; but it would be useful to combine both behaviors
> >> >> > (try these concrete URIs and fall back to the Flight RPC service itself)
> >> >> > without requiring the service to know its own public address.
> >> >> >
> >> >> > Best,
> >> >> > David
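
For anyone trying to picture how a client might tell the two conventions discussed in this thread apart, here is a rough sketch in Python. It is not taken from any Arrow client. The function names (classify, pick_locations) are made up for illustration, the arrow-flight-fallback scheme name is only the one proposed above and may change, and treating 0.0.0.0 as "this service" is Joel's suggestion, not settled behavior. It only uses the standard library URI parser.

```python
from urllib.parse import urlsplit

# Assumptions for illustration only; neither marker is finalized in this thread.
FALLBACK_SCHEME = "arrow-flight-fallback"  # scheme name proposed by David
THIS_HOST = "0.0.0.0"                      # host marker suggested by Joel


def classify(uri):
    """Return (is_origin_service, transport_parts, host) for a location URI."""
    parts = urlsplit(uri)
    # Informal "+" convention: "a+b+c://" carries several scheme components.
    scheme_parts = parts.scheme.split("+")
    host = parts.hostname  # None when the URI has no host part
    if scheme_parts[0] == FALLBACK_SCHEME:
        # Whatever follows the fallback marker names the transport, if any.
        return True, scheme_parts[1:], host
    return host == THIS_HOST, scheme_parts, host


def pick_locations(uris, supported):
    """Keep locations whose base transport the client supports, in list order."""
    chosen = []
    for uri in uris:
        is_origin, transport, _host = classify(uri)
        # A bare fallback URI names no transport: reuse the existing connection.
        if not transport or transport[0] in supported:
            chosen.append((uri, is_origin))
    return chosen
```

With Joel's example list and a client that only speaks grpc, pick_locations would keep "grpc://0.0.0.0" (flagged as the originating service) and "grpc://1.2.3.4", and skip the ucx entry. It keeps list order, which matches the left-to-right scan David mentions for the ADBC/JDBC drivers, but a real client is free to order by preference instead.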