[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-07-19 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
@cjolif no problem, thanks for the notice. I'll try to incorporate the 
changes you mentioned above to the previous work you've already done. Thanks a 
lot!


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-07-19 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
@tzulitai I'm unfortunately totally buried under work at the moment so I 
don't feel like I will have time in such a short deadline :( Sorry about that. 
If for some reason more delays is added please let me know again and I will see 
what I can do? Otherwise I should have time to do a quick review of whatever 
someone else would be doing.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-07-19 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
@cjolif Ideally we have ES 6.x connector merged by the beginning of next 
week. Let me know if this is possible for you. I'll proceed to merge this PR 
first.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-07-19 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
@cjolif do you think you would be able to quickly open a PR for the REST 
6.x connector that includes the new changes you mentioned, based on this one?


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-07-18 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
@tzulitai 

Happy to see progress is made on this!

> After merging this, I'll also try cherry-picking your 6.x REST-based ES 
connector on top. If that works well, will also merge that.

Note that since the initial ES PR (#5374 ) I made a couple of changes in 
our own copy of this. 

1. Elasticsearch REST API can have a context root in addition the to list 
of httpHosts, so I added the ability to have prefixPath, and calling:

```java
final RestClientBuilder builder = 
RestClient.builder(httpHosts.toArray(new HttpHost[httpHosts.size()]));

if (pathPrefix != null && !pathPrefix.isEmpty() && 
!pathPrefix.equals("/")) {
  builder.setPathPrefix(this.pathPrefix);
}
```

So that is set on the builder.

2. Elasticsearch REST can be protected by login/password so I added the 
ability to set username/password:

```java
  private CredentialsProvider getCredentialProvider() {
CredentialsProvider credentialsProvider = null;
if (userConfig.containsKey(CONFIG_KEY_ES_USERNAME) && 
userConfig.containsKey(CONFIG_KEY_ES_PASSWORD)) {
  credentialsProvider = new BasicCredentialsProvider();
  credentialsProvider.setCredentials(AuthScope.ANY,
  new 
UsernamePasswordCredentials(userConfig.get(CONFIG_KEY_ES_USERNAME), 
userConfig.get(CONFIG_KEY_ES_PASSWORD)));
}
return credentialsProvider;
  }
```
and 
then
```
builder.setHttpClientConfigCallback(httpClientBuilder ->
httpClientBuilder
.setDefaultCredentialsProvider(getCredentialProvider()))
```

More generally it should be easy for the user to change how the builder is 
configure to make sure people can customize this as they want (like configure 
SSL...).


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-07-18 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
I took another look at the PR, and also talked with @tillrohrmann about 
merging this for 1.6.
I think this LGTM, and with these changes we will at least have an ES 5.x 
connector that is 5.3+ compatible.

Merging ..

After merging this, I'll also try cherry-picking your 6.x REST-based ES 
connector on top. If that works well, will also merge that.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-06-03 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
> @cjolif do you think it would be possible that with a clean cut using a 
REST implementation, we no longer need to have separate modules anymore for ES 
6.x, 7.x, 8.x or so on?
i.e., it would only be a matter for the user of recompiling that REST-based 
implementation with a different ES client version.

@tzulitai I guess, in theory, there is alway a risk Elasticsearch breaks 
the compatibility between two major versions again even on the High Level REST 
Client APIs... My feeling is they are now trying to avoid that. But I did not 
find any wording that would allow us to "rely" on that "for sure".


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-30 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
@cjolif do you think it would be possible that with a clean cut using a 
REST implementation, we no longer need to have separate modules anymore for ES 
6.x, 7.x, 8.x or so on?
i.e., it would only be a matter for the user of recompiling that REST-based 
implementation with a different ES client version.

If no, then I would still prefer that we continue with the current approach 
this PR is proposing, since we need this fix in to have Elasticsearch 5.3+ 
working anyways.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-29 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
What I could do if that can help making progress, is to create a second PR 
based on this one and introducing the RestHighLevelClient implementation based 
on those APIs? This would allow to get an idea of what we would get?  


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
> When planning to switch to REST, are we speaking of an implementation 
that works directly against Elasticsearch's REST API? Or are we thinking of 
using Elasticsearch's RestHighLevelClient?

To me at least, yes, when I say REST I mean RestHighLevelClient.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
One more thing to clarify:
When planning to switch to REST, are we speaking of an implementation that 
works directly against Elasticsearch's REST API? Or are we thinking of using 
Elasticsearch's 
[RestHighLevelClient](https://snapshots.elastic.co/javadoc/org/elasticsearch/client/elasticsearch-rest-high-level-client/7.0.0-alpha1-SNAPSHOT/org/elasticsearch/client/RestHighLevelClient.html)?

I would assume the latter, but IMO we would not be able to avoid yet again 
having a common base module across future versions (e.g. across ES 6.x, 7.x, 
and so on), even if we make a clean cut now.
So, I have the feeling that the main problem here isn't that we are sharing 
code between versions, but the fact that our base shared code isn't 
future-proof enough for potential 3rd party API breaks.

That's the main reason why I'm proposing not to expose Elasticsearch 
classes anymore through base class APIs in the shared module.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/6043
  
The main reason why the discussion leaned towards the current proposed 
change by this PR, was that only Elasticsearch 5.6+ supports REST.

Only working towards a clean-cut module that uses REST, would mean that we 
still wouldn't be able to support Elasticsearch 5.2+ up to Elasticsearch 5.5.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/6043
  
@cjolif I agree, let's do something here.

@tzulitai what do you think about trying to use the switch to REST to make 
a clean cut and start a new connector project (without dependency on 
`flink-connector-elasticsearch-base`). As an experiment, we could try how much 
code we would actually need to copy into the new project.

@aljoscha and @patricklucas I remember you also had some thoughts on the 
elasticsearch connectors.

I am +1 for seeing if we can drop ElasticSearch 1.x and 2.x support, but 
that should be a separate thread.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
Overall I think the most important thing to do is do something :) We can't 
let Flink elasticsearch in the broken state they are today. There must either 
be a purely REST-based solution or make sure the current code path is working 
and allowing people to build a REST-based solution if they want to on top of it.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread cjolif
Github user cjolif commented on the issue:

https://github.com/apache/flink/pull/6043
  
@StephanEwen that is I think more or less what I proposed here: 
https://lists.apache.org/thread.html/e7f584e0df9ce510fa5bee8d3a7e59b6df885f7deee36710a1cbb9b1@%3Cdev.flink.apache.org%3E

"In the hope of moving that forward I would like to propose for 1.6 a new
Elasticsearch 6.x+ sink that would follow the design of the previous ones
BUT only leverage the new REST API and not inherit from existing classes."

But @tzulitai hinted into a different direction, that I followed for this 
PR.

Personally I think both approaches make sense. I don't have a strong 
opinion. Even though for my personal use-cases just doing a separated REST 
API-based sink would be enough.


---


[GitHub] flink issue #6043: [FLINK-7386] evolve RequestIndexer API to make it working...

2018-05-24 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/6043
  
As a high-level comment, I think we may want start making the ElasticSearch 
connectors projects independent of each other.

We previously tried to share code between versions, which has made things 
clumsy both from the dependency management and the implementation (api bridges, 
etc.). It also couples different versions, such that a bug fix in one connector 
version often affects other connectors as well.

The REST-based client may be a good time to start clean, create a new 
project with no dependencies on the base connector project, and copy the 
necessary code over.

What do you think?


---