On Thu, Oct 5, 2017, at 13:33, Michael Pearce wrote:
> To me, this is a lot more in line with many other systems' connections:
> having a single connection string / URI. Is suggesting or wanting this
> really that left-field?
>
> If anything, this brings Kafka a more standardised approach, imo: a
> unified resource identifier, a protocol name, and a set schema for that.
>
> e.g.
> Database connection strings like
>
> oracle:
> jdbc:oracle:thin:@(description=(address_list=
> (address=(protocol=tcp)(port=1521)(host=prodHost)))
> (connect_data=(INSTANCE_NAME=ORCL)))

Hmm. That isn't a URI, though, right? So adopting URIs doesn't help us
integrate with JDBC. In any case, since Kafka is not a database, it is
a little unclear what better integration with JDBC would look like.
Perhaps that is worth thinking about at some point, but it seems
unrelated to this URI discussion.

> On 05/10/2017, 20:10, "Clebert Suconic" <clebert.suco...@gmail.com>
> wrote:
>
> On Thu, Oct 5, 2017 at 2:20 PM, Colin McCabe <cmcc...@apache.org>
> wrote:
> > We used URIs as file paths in Hadoop. I think it was a mistake, for a
> > few different reasons.
> >
> > URIs are actually very complex. You probably know about scheme, host,
> > and port, but did you know about authority, user-info, query, fragment,
> > scheme-specific-part? Do you know what they do in Hadoop? The mapping
> > isn't obvious (and it wouldn't be obvious in Kafka either).
>
> URIs are just a hashmap of key=string.. just like Properties...

You really can't treat a URI as a hashmap. For one thing, the scheme and
hostname parts are not optional. You are probably thinking of the
"query" part (the part after the question mark). This isn't a map
either; by convention it's a sequence of ampersand-separated key=value
pairs. The same key can appear multiple times. And you have to encode
everything with RFC 3986 "percent encoding."

> The Consumer and Producer just take such a HashMap.. and these
> values are easy to translate to boolean, integer.. etc. We would just
> need to add such mapping as part of this task when done. I don't see
> anything difficult there.

I don't object to having some kind of connection string that rolls up
all the configuration properties. I just don't think it should be a
URI.

> >
> > When you flip back and forth between URIs and strings (and you
> > inevitably will do this, when serializing or sending things over the
> > wire), you run into tons of really hard problems. Should you preserve
> > the "fragment" (the thing after the hash mark) for your URI, or not?
> > It may not do anything now, but maybe it will do something later. URIs
> > also have complex string escaping rules. Parsing URIs is very messy,
> > especially when you start talking about non-Java programming languages.
>
> Why flip back and forth? URIs would generate the same HashMap that's
> being generated today.. I don't see any mess here.
> Besides... This would be an addition, not a replacement...
>
> And I'm talking only about the Java API now.

We have a lot of non-Java clients; those should be part of the
discussion.

> Again, all the properties on ProducerConfig and ConsumerConfig seem
> easy to map to primitive types (String, numbers.. booleans).
>
> Serialization shouldn't be a problem there. It would generate the same
> properties that are generated now.
>
> > URIs are designed for a world where you talk to a single host over a
> > single port. That isn't the world distributed systems live in. You
> > don't want your clients to fail to bootstrap because the single server
> > you specified is having a bad day, even when the other 8 servers are up.
>
> I have seen a few projects using this style of URI, and I would do
> the same here:
>
> If you have multiple hosts:
>
> KafkaConsumer consumer = new
> KafkaConsumer("kafka:(kafka://host1:port,kafka://host2:port)?property1=value");

That's not a valid URI?

> If you have a single host:
> KafkaConsumer consumer = new
> KafkaConsumer("kafka://host2:port?property1=value&property2=value2");
>
> One example of an Apache project using a similar approach is
> qpid-jms:
> http://qpid.apache.org/releases/qpid-jms-0.25.0/docs/index.html#failover-configuration-options
>
> > The bottom line is that URIs are the wrong abstraction for the job.
> > They just don't express what we really want, and they introduce a lot of
> > complexity and ambiguity.
>
> I have seen the opposite to be honest. This has been simpler for me
> and users I know than using a HashMap..
> Users in my experience tend to write this faster.

Users tend to make mistakes when writing URIs. For example, how do you
translate a filename with spaces and commas into a URI? I had to debug
these issues. It is why I dislike URIs.

As I said before, a connection string might be a good idea. A URI, no.

best,
Colin

> Users can certainly put up with the HashMap.. but this is easier to
> remember. I'm just proposing what I think is a simpler API.
>
> Perhaps we should move into the KIP discussion itself here.. I first
> intended to start this thread to see if it would make sense or not...
> But I don't have authorization to create the KIP page.. so again..
> based on the contributing page.. can someone grant me authorization
> for the WIKI space?
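
For anyone following the thread, here is a minimal sketch of how the
JDK's own java.net.URI treats the kinds of strings discussed above. It
is only an illustration, not a proposed Kafka API: the class name
UriPitfalls, the helpers parses and queryPairs, the numeric ports, and
the use of acks and client.id as query keys are all assumptions made up
for this example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Kafka or of any proposal in this thread.
public class UriPitfalls {

    /** Returns true if java.net.URI accepts the string at all. */
    public static boolean parses(String s) {
        try {
            URI.create(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /**
     * Splits the raw (still percent-encoded) query on '&' into key/value
     * entries. Order and duplicate keys are preserved, so the result is a
     * list of pairs rather than a map.
     */
    public static List<String[]> queryPairs(URI uri) {
        List<String[]> pairs = new ArrayList<>();
        String raw = uri.getRawQuery();
        if (raw == null) {
            return pairs;
        }
        for (String part : raw.split("&")) {
            if (part.isEmpty()) {
                continue;
            }
            int eq = part.indexOf('=');
            String key = eq < 0 ? part : part.substring(0, eq);
            String value = eq < 0 ? "" : part.substring(eq + 1);
            pairs.add(new String[] { key, value });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The multi-host form: java.net.URI treats it as an opaque URI,
        // so it exposes no host and no query component.
        URI multi = URI.create(
            "kafka:(kafka://host1:9092,kafka://host2:9092)?property1=value");
        System.out.println("opaque: " + multi.isOpaque());
        System.out.println("host:   " + multi.getHost());
        System.out.println("query:  " + multi.getQuery());

        // The same key may appear more than once in a query string.
        URI dup = URI.create("kafka://host1:9092?acks=1&acks=all");
        for (String[] p : queryPairs(dup)) {
            System.out.println(p[0] + " -> " + p[1]);
        }

        // A raw space is rejected; the value has to be percent-encoded.
        System.out.println(parses("kafka://host1:9092?client.id=my app"));
        System.out.println(parses("kafka://host1:9092?client.id=my%20app"));
    }
}
```

The point of the sketch is only that a caller who wants a
Properties-like map has to decide what to do with duplicate keys,
encoding, and the multi-host form; java.net.URI does not decide any of
that for you.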