[ https://issues.apache.org/jira/browse/KUDU-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Percy updated KUDU-1882:
-----------------------------
    Description: 
The RegexpKuduOperationsProducer currently has the following configuration 
options that could be improved:


|| Property Name || Default || Required? || Description ||
| {code}producer.skipMissingColumn{code} | {code}false{code} | No | What to do if a column in the Kudu table has no corresponding capture group. If set to true, a warning message is logged and the operation is still attempted. If set to false, an exception is thrown and the sink will not process the Event, causing a Flume Channel rollback. |
| {code}producer.skipBadColumnValue{code} | {code}false{code} | No | What to do if a value in the pattern match cannot be coerced to the required type. If set to true, a warning message is logged and the operation is still attempted. If set to false, an exception is thrown and the sink will not process the Event, causing a Flume Channel rollback. |
| {code}producer.warnUnmatchedRows{code} | {code}true{code} | No | Whether to log a warning about payloads that do not match the pattern. If set to false, event bodies with no matches will be silently dropped. |

It would be an improvement if each of these concepts had the following 
options: {{warn}}, {{ignore}}, {{reject}}

Where {{warn}} would log a warning to the Flume log and continue processing, 
{{ignore}} would attempt to continue processing without issuing a warning, and 
{{reject}} would throw an exception.
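
The three proposed options could be modeled as a single enum shared by all three settings. The following is only a minimal sketch of that idea, not the sink's actual code: {{MismatchPolicy}}, {{handle}}, {{fromSkipFlag}}, {{fromWarnFlag}} and the in-memory warning list are hypothetical names introduced for illustration, and {{IllegalStateException}} stands in for whatever exception the producer would really throw to trigger the Channel rollback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class MismatchPolicyDemo {

  /** Hypothetical three-way policy; the names mirror the proposed options. */
  enum MismatchPolicy {
    WARN, IGNORE, REJECT;

    /** Parses a configuration value such as "warn" (case-insensitive). */
    static MismatchPolicy parse(String value) {
      return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /** Maps an existing skip* boolean: true logs and continues, false throws. */
    static MismatchPolicy fromSkipFlag(boolean skip) {
      return skip ? WARN : REJECT;
    }

    /** Maps the existing warnUnmatchedRows boolean: true warns, false drops silently. */
    static MismatchPolicy fromWarnFlag(boolean warn) {
      return warn ? WARN : IGNORE;
    }
  }

  /** Stand-in for the Flume log so the sketch stays self-contained. */
  static final List<String> WARNINGS = new ArrayList<>();

  /**
   * Applies a policy to a detected problem.
   * Returns normally when processing should continue; throws on REJECT.
   */
  static void handle(MismatchPolicy policy, String problem) {
    switch (policy) {
      case WARN:
        WARNINGS.add(problem); // a real sink would log a warning here
        break;
      case IGNORE:
        break; // continue without logging anything
      case REJECT:
        // Assumption: any unchecked exception; a real sink would rely on
        // the exception to cause the Flume Channel rollback.
        throw new IllegalStateException(problem);
    }
  }

  public static void main(String[] args) {
    handle(MismatchPolicy.parse("warn"), "no capture group for column 'ts'");
    handle(MismatchPolicy.parse("ignore"), "no capture group for column 'ts'");
    System.out.println("warnings recorded: " + WARNINGS.size());
    try {
      handle(MismatchPolicy.parse("reject"), "value 'abc' is not an INT32");
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The two {{from*Flag}} helpers show one way the current boolean properties might map onto the new values, which would let existing configurations keep their present behavior.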

It may be that some fields are nullable or have defaults, potentially due to an 
ALTER TABLE, and we don't want to fill the Flume logs with useless warnings. 
Users may also want to reject any Events that don't match their regex so they 
can correct the configuration and restart Flume without losing those Events.

  was:
The RegexpKuduOperationsProducer currently has the following configuration 
options that could be improved:


|| Property Name || Default || Required? || Description ||
| {{producer.skipMissingColumn}} | {{false}} | No | What to do if a column in the Kudu table has no corresponding capture group. If set to true, a warning message is logged and the operation is still attempted. If set to false, an exception is thrown and the sink will not process the Event, causing a Flume Channel rollback. |
| {{producer.skipBadColumnValue}} | {{false}} | No | What to do if a value in the pattern match cannot be coerced to the required type. If set to true, a warning message is logged and the operation is still attempted. If set to false, an exception is thrown and the sink will not process the Event, causing a Flume Channel rollback. |
| {{producer.warnUnmatchedRows}} | {{true}} | No | Whether to log a warning about payloads that do not match the pattern. If set to false, event bodies with no matches will be silently dropped. |

It would be an improvement if each of these concepts had the following 
options: {{warn}}, {{ignore}}, {{reject}}

Where {{warn}} would log a warning to the Flume log and continue processing, 
{{ignore}} would attempt to continue processing without issuing a warning, and 
{{reject}} would throw an exception.

It may be that some fields are nullable or have defaults, potentially due to an 
ALTER TABLE, and we don't want to fill the Flume logs with useless warnings. 
Users may also want to reject any Events that don't match their regex so they 
can correct the configuration and restart Flume without losing those Events.


> Configuration improvements for Flume Kudu Sink regexp operations producer
> -------------------------------------------------------------------------
>
>                 Key: KUDU-1882
>                 URL: https://issues.apache.org/jira/browse/KUDU-1882
>             Project: Kudu
>          Issue Type: Bug
>          Components: flume-sink, integration
>    Affects Versions: 1.2.0
>            Reporter: Mike Percy



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
