Could someone guide me on this one?
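
To make the question quoted below more concrete, here is a rough sketch of the idea in plain Python. It does not use any Parquet API. The names tokenise, build_term_index and candidate_row_groups are made up for illustration, and "row group" here is just a list of log lines. The assumption is that we keep a set of distinct terms per row group and use it to skip groups that cannot contain a search term.

```python
import re

def tokenise(line):
    """Lower-case a log line and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", line.lower()))

def build_term_index(row_groups):
    """Collect the distinct terms of each row group (hypothetical helper).

    row_groups is a list of lists of raw log lines; the result has one
    set of terms per row group, playing the role of a term dictionary.
    """
    return [set().union(*(tokenise(line) for line in group))
            for group in row_groups]

def candidate_row_groups(term_index, term):
    """Return indices of row groups whose term set contains the term."""
    term = term.lower()
    return [i for i, terms in enumerate(term_index) if term in terms]

# Two illustrative "row groups" of raw log lines.
row_groups = [
    ["parquet is a columnar format", "dictionary encoding helps repeated values"],
    ["connection timed out", "retrying request"],
]
index = build_term_index(row_groups)

# Only the first group needs to be read when searching for "columnar".
print(candidate_row_groups(index, "columnar"))
```

A real implementation would still have to scan the rows inside each candidate group, since the term set only says that a term occurs somewhere in the group.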

Regards
Manik Singla
+91-9996008893
+91-9665639677

"Life doesn't consist in holding good cards but playing those you hold
well."


On Tue, Jun 11, 2019 at 5:58 PM Manik Singla <smanik...@gmail.com> wrote:

> Hey Team
>
> I have started using parquet recently.
>
> The kind of data I save looks something like this:
>
> *raw   hostname cluster serviceName  *
>
> where raw holds the actual log lines.
>
> For raw, dictionary encoding doesn't help, as no two log lines are the same.
> But if we tokenise the terms before building the dictionary, then the
> dictionary can help filter out unwanted rows. For example, "parquet is a
> columnar format" would become "parquet", "is", "a", "columnar", "format".
>
> Also, I see mention of merging bloomfilter, but I am not sure whether we
> are considering tokenisation there.
>
> Do we support some out-of-the-box way to tokenise text before building the dictionary?
>
> Also, what are your views on adding it if we don't?
>
> Regards
> Manik Singla
> +91-9996008893
> +91-9665639677
>
> "Life doesn't consist in holding good cards but playing those you hold
> well."
>
