Hi Team,
I am trying to extract the text between two annotations that may be on
different lines and may have other annotations between them.
I have tried several approaches, but none of them produces the expected result.
Sample input:
Seller Name FirstAvenue Mortgage, Contact Name John
TN 12230 Contact Title Supervisor
Code:
TYPESYSTEM utils.PlainTextTypeSystem;
ENGINE utils.PlainTextAnnotator;
DECLARE Keyword (STRING label);
DECLARE Entry(Keyword keyword);
DECLARE Keyword SellerNameKeyword, SellerNameContextBlocker, ContactNameKeyword;
EXEC(PlainTextAnnotator, {Line,Paragraph});
ADDRETAINTYPE(WS);
Line{->TRIM(WS)};
Paragraph{->TRIM(WS)};
REMOVERETAINTYPE(WS);
"Seller Name" -> SellerNameKeyword("label" = "Seller Name");
"Contact Title" -> SellerNameContextBlocker("label" = "Seller Name Context Blocker");
"Contact Name" -> ContactNameKeyword("label"= "Contact Name");
DECLARE Entity (STRING label, STRING value);
DECLARE Entity ContactName, SellerName;
BLOCK(line1) Line{CONTAINS(ContactNameKeyword)} {
    ContactNameKeyword c:#{-PARTOF(ContactName) -> CREATE(ContactName, "label" = "Contact Name", "value" = c.ct)};
}
SellerNameKeyword c:#{-PARTOF(ContactNameKeyword), -PARTOF(SellerNameContextBlocker), -PARTOF(ContactName) -> CREATE(SellerName, "label" = "Seller Name", "value" = c.ct)} SellerNameContextBlocker;
Actual output (SellerName value): FirstAvenue Mortgage, Contact Name John TN 12230
Expected output (SellerName value): FirstAvenue Mortgage, TN 12230
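To make the expected result concrete, here is a minimal Python sketch of the extraction I am after. It is plain string handling, not Ruta, and not a proposed solution. The helper names rest_of_line and text_between are hypothetical and exist only for this illustration. The sketch assumes the contact name runs from its keyword to the end of the line, as in the BLOCK rule above.

```python
import re


def rest_of_line(text, keyword):
    """Return the keyword plus everything after it up to the end of its line."""
    start = text.index(keyword)
    end = text.find("\n", start)
    return text[start:] if end == -1 else text[start:end]


def text_between(text, start_kw, end_kw, excluded):
    """Return the text between two keywords, with the excluded spans cut out."""
    begin = text.index(start_kw) + len(start_kw)
    end = text.index(end_kw, begin)
    between = text[begin:end]
    for span in excluded:
        # Cut out spans that belong to another entity (here: the contact name).
        between = between.replace(span, " ")
    # Collapse line breaks and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", between).strip()


sample = (
    "Seller Name FirstAvenue Mortgage, Contact Name John\n"
    "TN 12230 Contact Title Supervisor"
)
contact = rest_of_line(sample, "Contact Name")  # "Contact Name John"
seller = text_between(sample, "Seller Name", "Contact Title", [contact])
print(seller)  # FirstAvenue Mortgage, TN 12230
```

In other words, the SellerName value should be the text between the two keywords with the ContactName span removed, even though that text is not contiguous in the document.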
Why does the -PARTOF condition not work with the # wildcard? I also tried
ANY+, but that did not work either.
My understanding is that ANY matches each token individually, while # matches
everything. How does the usage of # and ANY differ?
I have also posted this question on Stack Overflow:
https://stackoverflow.com/questions/58830986/get-text-between-two-annotated-tags-in-ruta
Please suggest how this can be achieved, or recommend a better approach.
Thanks and Regards,
Shashank