Hi Yolanda, Jeremy,

Thanks for your useful samples. I will build on them to tackle a slightly more complex case.

Regards
Stephane

Stephane Tinseau
iSuite Technical Specialist
Thomson Reuters
Phone: +33 1 47 62 67 72
[email protected]
thomsonreuters.com

From: Jeremy Dyer [mailto:[email protected]]
Sent: 31 August 2016 17:28
To: [email protected]
Subject: Re: working with HTML table

Stephane -

Here is another option using just the GetHTMLElement processor, without any ExecuteScript processor. It uses a CSS selector to pull the elements and then NiFi Expression Language to split and add the values. It isn't much different from what you had. You were very close.

On Wed, Aug 31, 2016 at 10:06 AM, Yolanda Davis <[email protected]> wrote:

Hi Stephane,

Here's something I hope can help. In GetHTMLElement, instead of using the selector "table td", try "table tr" with an output type of "Text" and a destination type of flowfile-content. This should create one flow file per row, each containing the numeric text extracted from that row's td elements. From there you can use the ExecuteScript processor to trim the whitespace, convert the text values into numbers and sum them.

I was able to get this to work with the JavaScript (ECMAScript) below, using the example HTML you provided:

    var flowFile = session.get();
    if (flowFile != null) {
        // Java classes needed to read and rewrite the flow file content
        var StreamCallback = Java.type("org.apache.nifi.processor.io.StreamCallback");
        var IOUtils = Java.type("org.apache.commons.io.IOUtils");
        var StandardCharsets = Java.type("java.nio.charset.StandardCharsets");

        flowFile = session.write(flowFile, new StreamCallback(function(inputStream, outputStream) {
            var text = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
            // Each row arrives as space-separated cell values, e.g. "11 12"
            var res = text.split(" ");
            var count = 0;
            for (var i = 0; i < res.length; i++) {
                var value = parseInt(res[i], 10);
                // NaN never equals itself, so test with isNaN() rather than != NaN
                if (!isNaN(value)) {
                    count += value;
                }
            }
            outputStream.write(count.toString().getBytes(StandardCharsets.UTF_8));
        }));
        flowFile = session.putAttribute(flowFile, "filename", flowFile.getId() + '_count.txt');
        session.transfer(flowFile, REL_SUCCESS);
    }

I've attached the template I used to do this, which hopefully can help as well. Please let me know if you have any questions.

Yolanda

On Wed, Aug 31, 2016 at 3:52 AM, <[email protected]> wrote:

Hi All,

I'm trying to extract values from an HTML table and perform a calculation on them with NiFi. The purpose of the test is to add up the TD values within each TR and write the result to a file. For this sample the results should be 23 and 43.

My table looks like:

<table>
<tr> <td>11</td> <td>12</td> </tr>
<tr> <td>21</td> <td>22</td> </tr>
</table>

My NiFi workflow is InvokeHTTP > Response > GetHTMLElement > Success > PutFile

The CSS selector for GetHTMLElement is "table td". I know that GetHTMLElement produces 0-N elements, but I don't know how to perform a calculation on them.

Any help would be appreciated.

Thanks
Regards
Stephane

Stephane Tinseau
Thomson Reuters
[email protected]
thomsonreuters.com

--
[email protected]
@YolandaMDavis
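
The per-row addition discussed in this thread can be tried outside NiFi before wiring it into ExecuteScript. The following is only a minimal sketch in plain JavaScript, assuming each row's text arrives as whitespace-separated integers such as "11 12". The name sumRowText is a hypothetical helper for illustration, not part of NiFi.

```javascript
// Hypothetical helper (not a NiFi API): sums the integers found in one row's text.
// Assumption: cells are separated by whitespace (spaces, tabs or newlines).
function sumRowText(text) {
  return text
    .trim()
    .split(/\s+/)
    .map(function (token) { return parseInt(token, 10); })
    // NaN is never equal to itself, so non-numeric tokens are dropped with isNaN()
    .filter(function (n) { return !isNaN(n); })
    .reduce(function (sum, n) { return sum + n; }, 0);
}

// Rows from the sample table in this thread
console.log(sumRowText("11 12"));    // 23
console.log(sumRowText(" 21\n22 ")); // 43
```

Filtering with isNaN() matters because a comparison such as `x != NaN` is always true, so a non-numeric token would otherwise turn the whole sum into NaN.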
