[ 
https://issues.apache.org/jira/browse/TIKA-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857872#comment-16857872
 ] 

Thomas van Hesteren commented on TIKA-2889:
-------------------------------------------

It uses port 12345 by default on start. If 12345 isn't available, it starts at 
60000 and moves down to the first free port. However, 12345 is almost always 
free. Do you suspect the port number?
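
To make the fallback concrete, here is a minimal sketch of that port 
selection. PickTikaPort, the isFree probe and the lower bound of 1024 are 
hypothetical names and values chosen for illustration; they are not taken 
from the actual processor.

```cpp
#include <functional>

// Hypothetical helper, not the processor's real code.
// Tries the preferred port first; if it is taken, scans downward from
// fallbackTop and returns the first port that isFree accepts.
// Returns -1 when no port in the range is free.
// isFree is an assumed caller-supplied probe (for example a bind() attempt).
int PickTikaPort(const std::function<bool(int)> &isFree,
                 int preferred = 12345,
                 int fallbackTop = 60000,
                 int fallbackBottom = 1024) { // lower bound is an assumption
    if (isFree(preferred)) {
        return preferred;
    }
    for (int port = fallbackTop; port >= fallbackBottom; --port) {
        if (isFree(port)) {
            return port;
        }
    }
    return -1;
}
```

Passing the probe in as a parameter keeps the selection logic separate from 
the socket code, so it can be checked without opening any ports.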

The code I use to parse a file:

fd = fopen(FilepathToProcess.c_str(), "rb");
if (fd != NULL) {
        CURL *curlTika = curl_easy_init();
        struct curl_slist *LibcurlHeadersTika = NULL;
        if (curlTika) {
                curl_easy_setopt(curlTika, CURLOPT_URL, "localhost:12345/rmeta/text");
                //AN EMPTY "Expect:" HEADER DISABLES LIBCURL'S "Expect: 100-continue"
                LibcurlHeadersTika = curl_slist_append(LibcurlHeadersTika, "Expect:");
                LibcurlHeadersTika = curl_slist_append(LibcurlHeadersTika, "Accept: application/json");
                //LIBCURL READS THESE NUMERIC OPTIONS AS long, SO PASS 1L/0L
                curl_easy_setopt(curlTika, CURLOPT_UPLOAD, 1L);
                curl_easy_setopt(curlTika, CURLOPT_READDATA, fd); //FILE HANDLE
                curl_easy_setopt(curlTika, CURLOPT_HTTPHEADER, LibcurlHeadersTika);
                //FILESIZE OF FILEPATH TO PROCESS
                curl_easy_setopt(curlTika, CURLOPT_INFILESIZE_LARGE, (curl_off_t)FileSize);
                curl_easy_setopt(curlTika, CURLOPT_WRITEFUNCTION, LibcurlResponse);
                //THIS VARIABLE HOLDS THE TIKA OUTPUT AFTER EXECUTING
                curl_easy_setopt(curlTika, CURLOPT_WRITEDATA, &CurlResponseFileContent);
                //CHECKS WHETHER THE FILE STILL EXISTS WHILE PROCESSING. NEVER TRIGGERED SO FAR
                curl_easy_setopt(curlTika, CURLOPT_XFERINFOFUNCTION, WatchFileExists);
                curl_easy_setopt(curlTika, CURLOPT_NOPROGRESS, 0L);
                CURLcode curl_code = curl_easy_perform(curlTika);
                curl_easy_cleanup(curlTika);
                curl_slist_free_all(LibcurlHeadersTika);
                if (curl_code == CURLE_OK) {
                        //FILE HANDLED FINE --> CONTINUE
                        }
                else {
                        //LIFE CHECKER CHECKS TIKA. IF NOT RUNNING ANYMORE -> REBOOT TIKA
                        }
                }
        //CLOSE THE FILE EVEN IF curl_easy_init() FAILED
        fclose(fd);
        }

The code I use to check whether Tika is running:

bool TikaIsRunning() {
        CURL *curlTika = curl_easy_init();
        string CurlResponseTika;
        if (curlTika) {
                curl_easy_setopt(curlTika, CURLOPT_URL, "localhost:12345/tika");
                curl_easy_setopt(curlTika, CURLOPT_WRITEFUNCTION, LibcurlResponse);
                curl_easy_setopt(curlTika, CURLOPT_WRITEDATA, &CurlResponseTika);
                curl_easy_perform(curlTika);
                curl_easy_cleanup(curlTika);
                }
        if (CurlResponseTika.substr(0, 12) == "This is Tika") {
                return true;
                }
        return false;
        }
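
Both snippets above pass LibcurlResponse as CURLOPT_WRITEFUNCTION and a 
string as CURLOPT_WRITEDATA, but the callback itself is not shown. A minimal 
sketch of what such a callback typically looks like follows; the body is an 
assumption based on libcurl's documented write-callback signature, not the 
actual implementation from the processor.

```cpp
#include <cstddef>
#include <string>

// Assumed shape of the LibcurlResponse callback (illustrative only).
// libcurl calls it once per received chunk; userdata is the pointer that
// was registered with CURLOPT_WRITEDATA, here a std::string.
std::size_t LibcurlResponse(char *data, std::size_t size, std::size_t nmemb,
                            void *userdata) {
    std::string *out = static_cast<std::string *>(userdata);
    const std::size_t total = size * nmemb; // number of bytes in this chunk
    out->append(data, total);
    // Returning anything other than total tells libcurl the write failed.
    return total;
}
```

Because the callback may run several times per response, it appends to the 
string rather than overwriting it.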

> Tika Server keeps crashing
> --------------------------
>
>                 Key: TIKA-2889
>                 URL: https://issues.apache.org/jira/browse/TIKA-2889
>             Project: Tika
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 1.18, 1.19, 1.19.1, 1.21
>         Environment: Both Ubuntu and Windows have the same bug/issue
>            Reporter: Thomas van Hesteren
>            Priority: Minor
>         Attachments: log4j.xml, tika-2.log, tika-server-everything-2.log, 
> tika-server-everything.log, tika-server-everything.log, 
> tika-server-everything.log, tika.log, tika.log
>
>
> I have a document processor which sends documents to the Tika Server over 
> cURL. However, the server crashes repeatedly, and the crashes are not tied 
> to a specific document. When it happens, cURL reports:
> Connection error: Couldn't connect to server
>  
> The Tika server is started when the script starts executing. For now, I have 
> worked around the issue with a watcher which restarts the Tika server when 
> it crashes. It then processes a few more documents and crashes again (after 
> a few minutes, let's say 5 minutes tops).
>  
> Is there any way to catch the exception (if it throws one)?
>  
> A log which shows the crash of the server:
> 04-06-2019 15:49:25|Processing a file of: 52.3kB
> 04-06-2019 15:49:24|Processing a file of: 255.5kB
> 04-06-2019 15:49:24|Processing a file of: 241.6kB
> 04-06-2019 15:49:23|Processing a file of: 37.7kB
> 04-06-2019 15:49:22|Processing a file of: 1.27MB
> 04-06-2019 15:49:21|Processing a file of: 55.8kB
> 04-06-2019 15:49:17|Processing a file of: 114.5kB
> 04-06-2019 15:49:08|Server is not running. Restarting Server. Connection 
> error: Couldn't connect to server
> 04-06-2019 15:49:03|Processing a file of: 41.0kB
> 04-06-2019 15:49:00|Processing a file of: 38.0kB
> 04-06-2019 15:48:59|Processing a file of: 37.1kB
> 04-06-2019 15:48:59|Processing a file of: 60.2kB
> 04-06-2019 15:48:59|Processing a file of: 280.7kB
> 04-06-2019 15:48:59|Processing a file of: 3.30MB



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
