[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
The actual solution I'm currently thinking of is a modified 
`HttpObjectAggregator` implementation that spills larger messages to disk.
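
Roughly along these lines (an illustrative sketch only, assuming Netty 4.x; 
`DiskSpillingAggregator` and `SpilledRequest` are made-up names, and error 
handling is omitted):

```java
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.handler.codec.http.HttpContent;
import io.netty.handler.codec.http.HttpObject;
import io.netty.handler.codec.http.HttpRequest;
import io.netty.handler.codec.http.LastHttpContent;

import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Sketch: aggregates an HTTP request by streaming its body chunks to a
 * temporary file instead of buffering them on the heap, as a plain
 * HttpObjectAggregator would.
 */
public class DiskSpillingAggregator extends SimpleChannelInboundHandler<HttpObject> {

    private HttpRequest currentRequest;
    private FileChannel spillChannel;
    private Path spillFile;

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, HttpObject msg) throws Exception {
        if (msg instanceof HttpRequest) {
            currentRequest = (HttpRequest) msg;
            spillFile = Files.createTempFile("rest-upload-", ".tmp");
            spillChannel = FileChannel.open(spillFile, StandardOpenOption.WRITE);
        } else if (msg instanceof HttpContent) {
            ByteBuf body = ((HttpContent) msg).content();
            // Stream the chunk straight to disk; nothing accumulates on the heap.
            while (body.isReadable()) {
                body.readBytes(spillChannel, body.readableBytes());
            }
            if (msg instanceof LastHttpContent) {
                spillChannel.close();
                // Hand the request metadata plus the spilled body downstream.
                ctx.fireChannelRead(new SpilledRequest(currentRequest, spillFile));
            }
        }
    }

    /** Hypothetical message type carrying the request headers and the body's location on disk. */
    public static final class SpilledRequest {
        public final HttpRequest request;
        public final Path bodyFile;

        SpilledRequest(HttpRequest request, Path bodyFile) {
            this.request = request;
            this.bodyFile = bodyFile;
        }
    }
}
```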


---


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-11 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
Removing the object aggregator would break the remaining pipeline, since the 
downstream handlers expect a single fully aggregated request, so that option 
is out too.
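
To illustrate why (a generic Netty sketch, not Flink's actual pipeline code; 
the class name `RestServerInitializer` is made up):

```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.http.FullHttpRequest;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;

public class RestServerInitializer extends ChannelInitializer<SocketChannel> {
    @Override
    protected void initChannel(SocketChannel ch) {
        ch.pipeline()
            .addLast(new HttpServerCodec())
            // Buffers HttpRequest + HttpContent chunks into one FullHttpRequest,
            // consuming memory proportional to the body size.
            .addLast(new HttpObjectAggregator(100 * 1024 * 1024))
            // Downstream handlers are typed against FullHttpRequest; without
            // the aggregator they would see raw HttpContent chunks and break.
            .addLast(new SimpleChannelInboundHandler<FullHttpRequest>() {
                @Override
                protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest req) {
                    // handle the fully aggregated request
                }
            });
    }
}
```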

As for the client upload, that's exactly what's being done right now: the 
jar is uploaded to the blob server, and only the blob key is transmitted via 
REST. The problem is that this requires any client to be written in Java and 
to depend on Flink, whereas we would prefer that any client, written in any 
language, can communicate with the REST APIs.
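
In pseudo-code, the current flow looks roughly like this (`BlobServerClient` 
and `RestClient` are made-up stand-ins for Flink-internal classes, shown only 
to contrast the two channels involved):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CurrentSubmissionFlow {

    interface BlobServerClient {
        /** Uploads the jar over Flink's binary blob protocol and returns its key. */
        String uploadJar(Path jar) throws Exception;
    }

    interface RestClient {
        /** Sends a plain JSON request over HTTP. */
        void post(String path, String jsonBody) throws Exception;
    }

    static void submit(BlobServerClient blobClient, RestClient restClient) throws Exception {
        // Step 1: the (Java-only) blob client streams the large jar to the blob server.
        String blobKey = blobClient.uploadJar(Paths.get("/path/to/job.jar"));
        // Step 2: only the small blob key crosses the REST API, so the HTTP
        // message stays far below any content-length limit.
        restClient.post("/jobs", "{\"jarBlobKey\": \"" + blobKey + "\"}");
    }
}
```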


---


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-10 Thread zjureel
Github user zjureel commented on the issue:

https://github.com/apache/flink/pull/4654
  
@kl0u @zentol Thank you for your suggestions. So removing the line 
`.addLast(new HttpObjectAggregator(100 * 1024 * 1024))` would be a temporary fix?

In fact it surprises me that the client sends the job's jar file to the 
REST server instead of to a distributed file system such as HDFS. As @zentol 
said, the jars may be several hundred MB large and we obviously can't allocate 
that much memory in general. Having the client upload the jars to a 
distributed file system may be a better way; alternatively, we could build a 
streaming service into the REST server that persists the jar's input stream 
as soon as it arrives instead of aggregating it in memory. What do you think? 
Thanks
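
For the streaming idea, Netty already ships a multipart decoder that can 
spill uploads to disk. A minimal sketch (assuming Netty 4.x and a 
multipart/form-data upload; `StreamingUploadSketch` is a made-up name, and 
this is not what the PR implements):

```java
import java.io.File;

import io.netty.handler.codec.http.HttpContent;
import io.netty.handler.codec.http.HttpRequest;
import io.netty.handler.codec.http.multipart.DefaultHttpDataFactory;
import io.netty.handler.codec.http.multipart.FileUpload;
import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder;
import io.netty.handler.codec.http.multipart.InterfaceHttpData;

public class StreamingUploadSketch {

    // true = received file data is written to temp files instead of the heap.
    private final DefaultHttpDataFactory factory = new DefaultHttpDataFactory(true);
    private HttpPostRequestDecoder decoder;

    public void onRequest(HttpRequest request) {
        decoder = new HttpPostRequestDecoder(factory, request);
    }

    public void onContent(HttpContent chunk) throws Exception {
        // Each chunk is appended to the backing temp file as it arrives.
        decoder.offer(chunk);
        try {
            while (decoder.hasNext()) {
                InterfaceHttpData data = decoder.next();
                if (data.getHttpDataType() == InterfaceHttpData.HttpDataType.FileUpload) {
                    File jarOnDisk = ((FileUpload) data).getFile(); // already spilled to disk
                    // hand the file off to the job submission logic here
                }
            }
        } catch (HttpPostRequestDecoder.EndOfDataDecoderException e) {
            // no more data available yet; wait for the next chunk
        }
    }
}
```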


---


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-07 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4654
  
This doesn't solve the problem of the JIRA. For the job submission we have 
to send jars to the cluster, which may be several hundred MB large; we 
obviously can't allocate that much memory in general.

As @kl0u said, setting this value any higher (it is already _very_ high) 
will cause insane memory consumption in the object aggregator whenever 
concurrent requests are in progress; e.g. ten concurrent uploads against a 
500 MB limit could pin up to 5 GB of memory.

As such this isn't even a temporary fix, since it is not viable for the 
use-cases that are affected by the limit, so I'm afraid we'll have to 
reject this PR.


---


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-07 Thread kl0u
Github user kl0u commented on the issue:

https://github.com/apache/flink/pull/4654
  
Hi @zjureel ! Thanks for the work!

This can be a temporary fix, but I was thinking more of a long-term one 
where there is no limit.
The problems that I see with such a temporary solution are:

1) If we add this as a configuration parameter and then remove it once we 
have a proper fix, this may confuse users.

2) With this fix, the aggregator is going to allocate the specified limit 
every time there is an element to parse, even though the element may be small. 
This can be a waste of resources.

What do you think @zentol ?



---


[GitHub] flink issue #4654: [FLINK-7521] Add config option to set the content length ...

2017-09-07 Thread zjureel
Github user zjureel commented on the issue:

https://github.com/apache/flink/pull/4654
  
@kl0u I have tried to fix 
[FLINK-7521](https://issues.apache.org/jira/browse/FLINK-7521) 
in this PR; could you please take a look when you're free? Thanks


---