No problem, Sven. Just curious: which version do you have? If I recall correctly, it was in as early as version 0.5.1.
On Thu, Apr 13, 2017 at 10:32 AM, Sven Davison <[email protected]> wrote:

> Thanks for the ideas, guys! For reasons beyond my control, I can't update
> to the newest NiFi to get the GetHTML processor @ this time. Maybe some
> day. I'll look into ExecuteScript and/or SplitText more.
>
> On Tue, Apr 11, 2017 at 1:14 PM, Jeremy Dyer <[email protected]> wrote:
>
>> Sven,
>>
>> There is also the GetHTML processor I added a while back. If the input is
>> valid HTML you should always be able to use a CSS selector to extract that
>> HTML value. If you can provide a sample of the HTML I would be glad to make
>> a flow for you doing so as an example.
>>
>> Jeremy
>>
>> On Apr 11, 2017, at 1:01 PM, Andy LoPresto <[email protected]> wrote:
>>
>> Sven,
>>
>> Currently I would recommend using ExecuteScript and simply streaming &
>> slicing the content bytes at line 10 (a one-line operation in Groovy, I
>> believe the same in Ruby and Python).
>>
>> This isn’t the first time I’ve heard of a similar request though, so I
>> think if you were to open a Jira requesting a “GetLine(s)” or “SliceText”
>> processor, it could be valuable to the community. The current component
>> solution would probably involve SplitText/SplitContent and, as you said,
>> decent overhead, especially if the desired content is early in the
>> flowfile.
>>
>> Andy LoPresto
>> [email protected]
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>>
>> On Apr 11, 2017, at 9:38 AM, Sven Davison <[email protected]> wrote:
>>
>> I'm looking to parse some HTML. It's not the cleanest, but I know that my
>> content is always on line 10 of the file. I could use SplitText and then
>> compare it to ensure it starts with XYZBeginningString, I suppose, but I'm
>> looking for something w/ less overhead, especially knowing the content is
>> always on line 10.
>>
>> Anyone have other/cleaner ideas on how to get the content of line 10?
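
For reference, the "stream and slice at line 10" idea Andy describes can be sketched in plain Python. This is only a minimal sketch under assumptions, not a ready-made NiFi script: `read_line` and `line_if_prefixed` are hypothetical helper names, UTF-8 content is assumed, and wiring it into ExecuteScript (reading the flowfile content as a stream inside the script) is left out.

```python
import io
from itertools import islice


def read_line(stream, line_number, encoding="utf-8"):
    """Return line `line_number` (1-based) from a binary stream.

    Returns None when the stream has fewer lines. Lines after the
    requested one are never iterated over.
    """
    if line_number < 1:
        raise ValueError("line_number is 1-based")
    # islice pulls lines lazily and stops right after the requested line.
    raw = next(islice(stream, line_number - 1, line_number), None)
    if raw is None:
        return None
    return raw.decode(encoding).rstrip("\r\n")


def line_if_prefixed(stream, line_number, prefix, encoding="utf-8"):
    """Return the requested line only if it starts with `prefix`, else None."""
    text = read_line(stream, line_number, encoding)
    if text is not None and text.startswith(prefix):
        return text
    return None


if __name__ == "__main__":
    # Stand-in for flowfile content: nine filler lines, then the line of interest.
    content = b"".join(b"<p>filler %d</p>\n" % i for i in range(1, 10))
    content += b"XYZBeginningString <b>payload</b>\n<p>trailer</p>\n"
    print(line_if_prefixed(io.BytesIO(content), 10, "XYZBeginningString"))
```

Here `line_if_prefixed(stream, 10, "XYZBeginningString")` covers both parts of the original question: it picks out line 10 and checks the expected prefix, without splitting the whole file into per-line pieces first.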
