Never mind. It seems the problem was that the hekad process was killed at some point, so the LogstreamerInput's file position was left somewhere in the middle of a message. Because I had not reset the pointers (sudo rm -rf /var/cache/hekad/*), the next time I ran hekad it resumed where it had stopped: in the middle of that message.
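
A minimal Python sketch of this effect (an illustration only, not Heka's code). It assumes records begin with an ISO-8601 timestamp, and the names HEADER, LOG and records_from are hypothetical. Reading from offset 0 yields whole records; resuming from an offset inside a record yields a leading fragment that no longer starts with a full timestamp.

```python
import re

# Hypothetical record header: ISO-8601 timestamp with a numeric UTC offset.
HEADER = re.compile(r"(?m)^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2} ")

# Hypothetical sample data modelled on the log format in this thread.
LOG = (
    "2015-01-02T13:14:15+01:00 DB QUERY:\n"
    "SELECT id, name FROM table WHERE id = 15\n"
    "2015-01-02T13:14:16+01:00 DB QUERY:\n"
    "SELECT id, somethingelse FROM table2\n"
)


def records_from(data: str, offset: int) -> list:
    """Split data[offset:] into records that each begin at a header match."""
    chunk = data[offset:]
    starts = [m.start() for m in HEADER.finditer(chunk)]
    # Text before the first header is an orphaned fragment of an earlier record.
    lead = [0] if not starts or starts[0] != 0 else []
    bounds = lead + starts + [len(chunk)]
    return [chunk[a:b] for a, b in zip(bounds, bounds[1:]) if chunk[a:b]]


whole = records_from(LOG, 0)    # clean start: every record has its header
resumed = records_from(LOG, 2)  # saved position pointed into the first record
print(repr(resumed[0][:24]))    # fragment starts with "15-01-02T..."
```

The truncated "15-01-02T..." prefix has the same shape as the "15-04-12T15:06:45+02:00 DB QUERY" payload reported in the quoted message below.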
From: Heka [mailto:[email protected]] On Behalf Of Gołębiewski Piotr DRA-BRB-ZIS
Sent: Friday, July 17, 2015 11:19 AM
To: [email protected]
Subject: [heka] [RegexSplitter] not splitting data into correct payloads?

Hi,

I've got some logs in this format:

yyyy-MM-ddTHH:mm:ssZ (Some header #1)
{below multiline message body, eg. PHP array dump}
yyyy-MM-ddTHH:mm:ssZ (Some header #2)
{below multiline message body, eg. PHP array dump}

Example:

2015-01-02T13:14:15+01:00 DB QUERY:
SELECT id, name FROM table WHERE id = 15
Namespace\Class:method
/path/to/file/calleing/log/Class:123
2015-01-02T13:14:15+01:00 DB QUERY:
SELECT id, somethingelse FROM table2 WHERE somethingelse = 'foobar'
Namespace\Class:method
/path/to/file/calleing/log/Class:123

Another example (different log, similar problem):

2015-01-02T13:14:15+01:00 REQUEST -> http://123.45.67.89:0123
array (
  0 => 'get_something_action',
  1 => array (
    'params' => array (
      'someparam' => 'foo',
      'otherparam' => 'bar',
    )
  )
)
2015-01-02T13:14:15+01:00 RESPONSE <- http://123.45.67.89:0123
array (
  'state' => 'ok',
  'params' => array (
    'some_return_param' => 'foo_bar_baz'
  )
)

I am using RegexSplitter with the delimiter configured like this:

* delimiter = '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2} DB QUERY:)\n'
* delimiter = '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2} \S+ [<-][->] \S+.*)\n'
* delimiter = '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2} .+)\n'

(Different delimiters are for different logs, but their structure is the same; the only difference is the "header" following the timestamp.)

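
As a rough illustration of what these delimiters are expected to do, here is a Python approximation of a regex splitter with one capture group (a sketch under assumptions, not Heka's implementation). The function name split_records and the sample data are hypothetical. With delimiter_eol false the captured header is prepended to the following record; with true it is appended to the preceding one. Dropping empty records is a simplification of this sketch.

```python
import re


def split_records(data: str, delimiter: str, delimiter_eol: bool = True) -> list:
    """Approximate splitting on a regex delimiter that has one capture group."""
    records = []
    carry = ""  # captured delimiter text waiting to be prepended
    pos = 0
    for match in re.finditer(delimiter, data):
        body = data[pos:match.start()]
        captured = match.group(1)
        if delimiter_eol:
            # Capture group closes the record that precedes the delimiter.
            record, carry = carry + body + captured, ""
        else:
            # Capture group opens the record that follows the delimiter.
            record, carry = carry + body, captured
        if record:  # simplification: skip empty records
            records.append(record)
        pos = match.end()
    tail = carry + data[pos:]
    if tail:
        records.append(tail)
    return records


DELIMITER = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2} DB QUERY:)\n"
SAMPLE = (
    "2015-01-02T13:14:15+01:00 DB QUERY:\n"
    "SELECT id, name FROM table WHERE id = 15\n"
    "Namespace\\Class:method\n"
    "2015-01-02T13:14:15+01:00 DB QUERY:\n"
    "SELECT id, somethingelse FROM table2\n"
)

prepended = split_records(SAMPLE, DELIMITER, delimiter_eol=False)
appended = split_records(SAMPLE, DELIMITER, delimiter_eol=True)
```

In this sketch every record in prepended begins with the timestamp header, which is the behaviour the decoders rely on. Note that the newline matched outside the capture group is consumed, so the header is joined directly to the first body line.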
When I run hekad (with PayloadEncoder/LogOutput) I am getting alerts:
(note: empty lines = successful messages, because I have the "payload_keep" param set to false)
(note2: failed parsing of an empty payload is also OK - it means the log file begins with an empty line, which is not a valid message)

Example output:

2015/07/17 10:44:51
2015/07/17 10:44:51
2015/07/17 10:44:51 Decoder 'AcmeNLogInput-AcmeNLogDecoder-23' error: Failed parsing: payload:
2015/07/17 10:44:51 Decoder 'AcmeDBLogInput-AcmeDBLogDecoder-6' error: Failed parsing: payload: 15-04-12T15:06:45+02:00 DB QUERY: select * from session where token = :token {":token":"foobar"} Acme\Database::isSessionExist /path/to/src/Acme/Database.php:123
2015/07/17 10:44:51
2015/07/17 10:44:51 Decoder 'AcmeDBLogInput-AcmeDBLogDecoder-5' error: Failed parsing: payload: _name, hash FROM devices WHERE device_hash = :hash AND client_name = :name AND ip_address = :ip {":ip":"123.45.67.89",":hash":"foobarbaz",":name":"ACME-01-NYC"} Acme\Database::getDevice /path/to/src/Acme/Database.php:456
2015/07/17 10:44:51
2015/07/17 10:44:51 Decoder 'AcmeDBLogInput-AcmeDBLogDecoder-14' error: Failed parsing: payload: arams /path/to/src/Acme/Database.php:234
2015/07/17 10:44:51
2015/07/17 10:44:51 Decoder 'AcmeDBLogInput-AcmeDBLogDecoder-9' error: Failed parsing: payload: _hash FROM devices WHERE device_hash = :hash AND client_name = :name AND ip_address = :ip {":ip":"98.76.54.32",":hash":"barfoobaz",":name":"ACME-01-NYC"} Acme\Database::getDevice /path/to/src/Acme/Database.php:456

My questions are:

* Why does the payload start with "arams /path/to/src/Acme/Database (...)" or "_hash (...)" or "15-04-12T15:06:45+02:00 DB QUERY (...)"? It should ALWAYS start with yyyy-MM-ddTHH:mm:ss {Some Header}: all decoders start with the timestamp pattern and `delimiter_eol` is set to false (so the captured delimiter should be prepended to the payload).
* I've found these messages in my log files, and they are no different (in format/pattern) from the others.

It seems like sometimes (?)
the payload (for some reason) gets stripped?

My hekad config:

[hekad]
max_message_size = 15728640 # 15MB
maxprocs = 4
poolsize = 10

Example Splitter config:

[AcmeMainLogSplitter]
type = "RegexSplitter"
delimiter = '(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2} DEVICE:.*)\n'
delimiter_eol = false

Example Decoder config:

[AcmeMainLogDecoder]
type = "SandboxDecoder"
filename = "/etc/heka/lua_decoders/acme_main_log.lua"
memory_limit = 15728640 # 15MB
output_limit = 15728640 # 15MB
instruction_limit = 1500000 # 1.5 million

Any ideas why this happens?

---------------------------------------------
Registered office: Getin Noble Bank SA, ul. Przyokopowa 33, 01-208 Warszawa
Registry court: District Court for the Capital City of Warsaw in Warsaw, 12th Commercial Division. KRS number: 0000304735 NIP: 108-000-48-50
Share capital, fully paid up: PLN 2,650,143,319.00
The inclusion of the above identifying data of Getin Noble Bank SA, pursuant to Art. 374 par. 1 of the Commercial Companies Code, does not imply that the e-mail message delivered to you is of a commercial nature and does not affect the interpretation of the statements it contains.
This e-mail and any files attached to it are confidential and may be legally protected. If you are not the intended recipient of this message, you may not disclose, copy, distribute, or otherwise share or use it. Please notify the sender immediately of any misaddressed message and delete it.
This e-mail message may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka

