[jira] [Commented] (KAFKA-4335) FileStreamSource Connector not working for large files (~ 1GB)

2016-10-25 Thread Rahul Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604522#comment-15604522
 ] 

Rahul Shukla commented on KAFKA-4335:
-

Yes, I got this exception on the producer console:

java.lang.OutOfMemoryError: Java heap space
    at org.apache.kafka.connect.file.FileStreamSourceTask.poll(FileStreamSourceTask.java:135)
    at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:155)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


> FileStreamSource Connector not working for large files (~ 1GB)
> --
>
> Key: KAFKA-4335
> URL: https://issues.apache.org/jira/browse/KAFKA-4335
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.0
>Reporter: Rahul Shukla
>Assignee: Ewen Cheslack-Postava
>
> I was trying to stream a large file (about 1 GB). The FileStreamSource connector
> does not work for it, although it works fine for small files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4335) FileStreamSource Connector not working for large files (~ 1GB)

2016-10-24 Thread Rahul Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604115#comment-15604115
 ] 

Rahul Shukla commented on KAFKA-4335:
-

It did not throw any exception, but it is not producing content to the topic 
either. I looked into the source code and found that it tries to read the file 
into memory and then produce the records. I believe it is difficult to hold the 
entire file in memory. Below is the source code snippet that does this:

int nread = 0;
while (readerCopy.ready()) {
    nread = readerCopy.read(buffer, offset, buffer.length - offset);
    log.trace("Read {} bytes from {}", nread, logFilename());

    if (nread > 0) {
        offset += nread;
        if (offset == buffer.length) {
            char[] newbuf = new char[buffer.length * 2];
            System.arraycopy(buffer, 0, newbuf, 0, buffer.length);
            buffer = newbuf;
        }

        String line;
        do {
            line = extractLine();
            if (line != null) {
                log.trace("Read a line from {}", logFilename());
                if (records == null)
                    records = new ArrayList<>();
                records.add(new SourceRecord(offsetKey(filename), offsetValue(streamOffset), topic, VALUE_SCHEMA, line));
            }
        } while (line != null);
    }
}
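
One way to read the snippet above is that the loop keeps appending to records 
for as long as the reader is ready, so a single poll() over a large file can 
accumulate every line before anything is returned. A minimal sketch of capping 
the batch size per call follows. It is an illustration only, not the 
connector's code: the class name BoundedLineBatcher and the maxBatchSize 
parameter are hypothetical, and it uses BufferedReader.readLine() instead of 
the ready()-based, non-blocking read in the original.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka Connect: hands back at most
// maxBatchSize lines per call so the pending batch stays bounded.
final class BoundedLineBatcher {
    private final BufferedReader reader;
    private final int maxBatchSize;

    BoundedLineBatcher(BufferedReader reader, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        this.reader = reader;
        this.maxBatchSize = maxBatchSize;
    }

    // Returns the next batch of lines, or an empty list once input is exhausted.
    List<String> poll() throws IOException {
        List<String> batch = new ArrayList<>();
        String line;
        // Check the size limit first so a line is never read and then dropped.
        while (batch.size() < maxBatchSize && (line = reader.readLine()) != null) {
            batch.add(line);
        }
        return batch;
    }

    public static void main(String[] args) throws IOException {
        BoundedLineBatcher batcher = new BoundedLineBatcher(
                new BufferedReader(new StringReader("a\nb\nc\nd\ne\n")), 2);
        List<String> batch;
        while (!(batch = batcher.poll()).isEmpty()) {
            System.out.println(batch); // prints [a, b] then [c, d] then [e]
        }
    }
}
```

With a cap like this, memory use per call depends on the batch size rather 
than the file size. A single very long line without a newline would still be 
buffered whole, so the sketch does not cover that case.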



[jira] [Commented] (KAFKA-4335) FileStreamSource Connector not working for large files (~ 1GB)

2016-10-24 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602851#comment-15602851
 ] 

Ewen Cheslack-Postava commented on KAFKA-4335:
--

Can you be more specific about what isn't working? Does it throw an exception 
or some other error?
