I'm trying to generate a serialized object using the flatfile_summarizer and
I'm having some difficulty.
I have a text file containing a list of regexes (one regex per line), and I'm
trying to load it with the following extractor config:
{
  "config" : {
    "columns" : {
      "regex" : 0
    },
    "value_filter" : "LENGTH(regex) > 0",
    "state_init" : "SET_INIT()",
    "state_update" : {
      "state" : "SET_ADD(state,regex)"
    },
    "state_merge" : "SET_MERGE(states)",
    "separator" : ","
  },
  "extractor" : "CSV"
}
I run the tool as follows:
/usr/hcp/current/metron/bin/flatfile_summarizer.sh -i ./regex.txt -o regex.ser -e regex_extractor.json
I end up with the following error message:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.metron.dataloads.nonbulk.flatfile.writer.LocalWriter.write(LocalWriter.java:45)
    at org.apache.metron.dataloads.nonbulk.flatfile.writer.Writers.write(Writers.java:54)
    at org.apache.metron.dataloads.nonbulk.flatfile.writer.Writer.write(Writer.java:30)
    at org.apache.metron.dataloads.nonbulk.flatfile.importer.LocalSummarizer.importData(LocalSummarizer.java:136)
    at org.apache.metron.dataloads.nonbulk.flatfile.SimpleFlatFileSummarizer.main(SimpleFlatFileSummarizer.java:51)
    at org.apache.metron.dataloads.nonbulk.flatfile.SimpleFlatFileSummarizer.main(SimpleFlatFileSummarizer.java:38)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Am I doing something wrong? Also, is there a better alternative to the "CSV"
extractor? Ideally, I'd like to load each line in its entirety, regardless of
the characters it contains (a regex may contain commas, for example).
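To make the goal concrete, here is a rough Python sketch of the behavior I'm
after. It is only an illustration, not Metron code: load_regex_set is a
hypothetical name, and the comments map each step onto the config keys above
by analogy only.

```python
from typing import Iterable, Set


def load_regex_set(lines: Iterable[str]) -> Set[str]:
    """Collect every non-empty line, verbatim, into a set (hypothetical helper)."""
    state: Set[str] = set()          # analogous to state_init: SET_INIT()
    for raw in lines:
        # Strip only the line terminator; the rest of the line is kept whole,
        # so commas inside a regex are never treated as separators.
        regex = raw.rstrip("\r\n")
        if len(regex) > 0:           # analogous to value_filter: LENGTH(regex) > 0
            state.add(regex)         # analogous to state_update: SET_ADD(state, regex)
    return state


if __name__ == "__main__":
    # Made-up sample input: one regex with a comma, one blank line, one more regex.
    sample = ["^foo,bar$\n", "\n", "\\d{1,3}\n"]
    print(sorted(load_regex_set(sample)))
```

The point is that a line such as ^foo,bar$ should end up in the set as a single
entry rather than being split on the comma.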
Thanks in advance,
David Auclair