Yes, the values are correct. The attribute has the expected value, i.e. for date 181231 in the filename I get the value 18231 for the attribute week_extracted, which is extracted from the filename with the split method.
Tom

On Wed, 30 Jan 2019 at 10:59, Arpad Boda <ab...@hortonworks.com> wrote:

> Hi Tom,
>
> “that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.”
>
> It works as they are strings, so not working would be a surprise. The
> question is: what are their values? 😊
>
> Regards,
> Arpad
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Wednesday, 30 January 2019 at 10:53
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> Hi Arpad,
>
> that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.
>
> But if I put week_att into the attribute, there is an error: week_att
> cannot be coerced as String, and if I put str_week it gives me week
> number 44.
>
> Tom
>
> On Wed, 30 Jan 2019 at 08:40, Arpad Boda <ab...@hortonworks.com> wrote:
>
> Tom,
>
> The Python code to get the week number for a datetime string seems to be
> correct.
>
> To help debugging, could you stamp your “date_final” or “date_file”
> variable to an attribute, so we could see what the input is?
>
> My gut feeling says there is some parsing magic going wrong here.
>
> Regards,
> Arpad
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Tuesday, 29 January 2019 at 20:13
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> With the following script I get week number 44 and year 118, which is a
> strange result.
> Week should be 1 and year 2019 for date 2018-12-31.
>
> What is wrong here?
>
> Tom
>
> from datetime import datetime, timedelta, date
>
> flowFile = session.get()
> if (flowFile != None):
>     file_name = flowFile.getAttribute('filename')
>
>     date_file = file_name.split("_")[6]
>     date_final = date_file.split(".")[0]
>     date_obj = datetime.strptime(date_final,'%y%m%d')
>     date_year = date_obj.year
>     date_day = date_obj.day
>     date_month = date_obj.month
>
>     week_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>     year_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>     str_week = str(week_att)
>     str_year = str(year_att)
>
>     flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
>     flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
>     session.transfer(flowFile, REL_SUCCESS)
>     session.commit()
>
> On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel <to.novo...@gmail.com> wrote:
>
> Thank you all for the answers. The reason why I want to do this with a
> Python script is the wrong calculation of the week number from a date.
> NiFi has that function in the expression language
> (extracted_date:format("w", <<time_zone>>)). My time zone is GMT+2.
>
> If I set a date, for example 20180819, and the time zone GMT in the
> function, I get week number 34, which is wrong. If I omit the time zone,
> I get week number 33, which is right. I'm not sure if that's a bug. You
> can test it for yourself, and if you do, please share your findings
> here; maybe I'm doing something wrong.
>
> On the other side, if I use Python, I'm more sure that I will get the
> correct week number, even for dates which overlap with a week number in
> the next year (e.g. 20181231).
>
> Since this calculation will be in production, I need a resilient
> workflow in the future without errors.
>
> Regarding the script I sent above, I'm getting the error: "week cannot
> be coerced as string". I checked right at the beginning if the session
> is null or not.
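For reference, the week calculation discussed above can be checked outside NiFi with plain Python. This is only a minimal sketch: the helper name iso_year_week and the sample filename are made up for illustration, and it assumes, as the script does, that the date is the seventh underscore-separated field of the filename in yyMMdd form.

```python
from datetime import datetime


def iso_year_week(file_name):
    """Return (ISO year, ISO week) as strings.

    Hypothetical helper: assumes the date is field 6 (zero-based) of the
    underscore-separated filename, in yyMMdd form, as in the script above.
    """
    date_part = file_name.split("_")[6].split(".")[0]
    parsed = datetime.strptime(date_part, "%y%m%d").date()
    iso = parsed.isocalendar()
    # Index the result instead of using attribute access so the same code
    # also runs on Python 2.7. Return strings because NiFi attribute
    # values are strings.
    return str(iso[0]), str(iso[1])


# Made-up filename, for illustration only.
print(iso_year_week("a_b_c_d_e_f_181231.csv"))
```

Under standard CPython this gives year 2019, week 1 for 181231 and week 33 for 180819, which are the values Tom expects. It does not explain the 44 / 118 result reported from inside ExecuteScript.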
>
> On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov <grapesmo...@gmail.com> wrote:
>
> I wanted to add, since I've done this specific operation many times, that
> you can really just do this via the NiFi expression language, which I
> think is more "idiomatic" than having ExecuteScript processors all over
> the place. Basically, you would have an UpdateAttribute that sets
> something called, say, date_extracted with an expression that looks
> something like
> ${filename:substringAfterLast('_'):toDate('yyyy.MM.dd')} (this is an
> approximation based on the above, modify as necessary for your purpose).
> Then you could use a second UpdateAttribute to extract various
> information from this date with the format command, e.g.
> ${date_extracted:format('<your format expression here>')}. I don't think
> there's one for "week" but in general this is the approach I take when I
> need to do date munging.
>
> On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel <to.novo...@gmail.com> wrote:
>
> Hi Matt, thanks for the suggestions. But performance is not crucial here.
>
> This is the code I tried, but I get the error: "AttributeError:
> 'NoneType' object has no attribute 'getAttribute' at line number 4"
>
> If I remove the code from line 6 to line 14, it works with some default
> attribute values for year_extracted and week_extracted; otherwise I get
> the error from above.
>
> Tom
>
> from datetime import datetime, timedelta, date
>
> flowFile = session.get()
> file_name = flowFile.getAttribute('filename')
>
> date_file = file_name.split("_")[6]
> date_final = date_file.split(".")[0]
> date_obj = datetime.strptime(date_final,'%y%m%d')
> date_year = date_obj.year
> date_day = date_obj.day
> date_month = date_obj.month
>
> week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
> year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>
> if (flowFile != None):
>     flowFile = session.putAttribute(flowFile, "year_extracted", year)
>     flowFile = session.putAttribute(flowFile, "week_extracted", week)
>     session.transfer(flowFile, REL_SUCCESS)
>     session.commit()
>
> On Tue, 29 Jan 2019 at 15:53, Matt Burgess <mattyb...@apache.org> wrote:
>
> Tom,
>
> Keep in mind that you are using Jython not Python, which I mention
> only to point out that it is *much* slower than the native Java
> processors such as UpdateAttribute, and slower than other scripting
> engines such as Groovy or Javascript/Nashorn.
>
> If performance/throughput is not a concern and you're more comfortable
> with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
> attributeName, attributeValue) should do the trick. Note that if you
> are adding more than a couple attributes, it's probably better to
> create a dictionary (eventually/actually, a Java Map<String,String>)
> of attribute name/value pairs, and use putAllAttributes(flowFile,
> attributes) instead, as it is more performant.
>
> Regards,
> Matt
>
> On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel <to.novo...@gmail.com> wrote:
> >
> > Thanks for the answer.
> >
> > Yes, I know I can handle that with Expression Language and the
> > UpdateAttribute processor, but this is a specific case at my work and
> > I think Python is a better and simpler solution. I need to calculate
> > that with a Python script.
> >
> > Tom
> >
> > On Tue, 29 Jan 2019 at 15:18, John McGinn <amruginn-n...@yahoo.com> wrote:
> >>
> >> Since your script shows that "filename" is an attribute of your
> >> flowfile, you could use the UpdateAttribute processor.
> >>
> >> If you right click on UpdateAttribute and choose ShowUsage, then choose
> >> Expression Language Guide, it shows you the things you can handle.
> >>
> >> Something along the lines of ${filename:getDelimitedField(6,'_')}, if I
> >> understand the Groovy code correctly. I did a GenerateFlowFile to an
> >> UpdateAttribute processor setting filename to "1_2_3_4_5_6.2_abc", then
> >> sent that to another UpdateAttribute with the getDelimitedField() I
> >> listed and I received 6.2. Then another UpdateAttribute could parse the
> >> 6.2 for the second substring, or you might be able to chain them in the
> >> existing UpdateProcessor.
> >>
> >> --------------------------------------------
> >> On Tue, 1/29/19, Tomislav Novosel <to.novo...@gmail.com> wrote:
> >>
> >> Subject: Modify Flowfile attributes
> >> To: users@nifi.apache.org
> >> Date: Tuesday, January 29, 2019, 9:04 AM
> >>
> >> Hi all,
> >> I'm trying to calculate week number and date
> >> from filename using ExecuteScript processor and Jython. Here
> >> is the Python script. How can I add the calculated
> >> attributes week and year to the flowfile?
> >> Please help, thank you.
> >> Tom
> >> P.S. Maybe I completely missed with this script.
> >> Feel free to correct me.
> >>
> >> import json
> >> import java.io
> >> from org.apache.commons.io import IOUtils
> >> from java.nio.charset import StandardCharsets
> >> from org.apache.nifi.processor.io import StreamCallback
> >> from datetime import datetime, timedelta, date
> >>
> >> class PyStreamCallback(StreamCallback):
> >>     def __init__(self, flowfile):
> >>         self.ff = flowfile
> >>         pass
> >>     def process(self, inputStream, outputStream):
> >>         file_name = self.ff.getAttribute("filename")
> >>         date_file = file_name.split("_")[6]
> >>         date_final = date_file.split(".")[0]
> >>         date_obj = datetime.strptime(date_final,'%y%m%d')
> >>         date_year = date_obj.year
> >>         date_day = date_obj.day
> >>         date_month = date_obj.month
> >>         week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
> >>         year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
> >>
> >> flowFile = session.get()
> >> if (flowFile != None):
> >>     session.transfer(flowFile, REL_SUCCESS)
> >>     session.commit()
>
> --
> http://www.google.com/profiles/grapesmoker
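
Pulling the suggestions from this thread together, an ExecuteScript body could look roughly like the sketch below. It has not been run against a NiFi instance. session and REL_SUCCESS are the variables ExecuteScript binds for the script. The build_attributes and run helpers, and the globals() guard at the bottom, are additions made for this sketch so the file can also be loaded outside NiFi. The sketch checks for a missing flowfile before reading attributes, converts the values to strings, and uses putAllAttributes as Matt suggests. It does not address the week 44 / year 118 result reported above.

```python
from datetime import datetime


def build_attributes(file_name):
    # Same parsing as in the thread: seventh "_"-separated field, yyMMdd.
    date_part = file_name.split("_")[6].split(".")[0]
    iso = datetime.strptime(date_part, "%y%m%d").date().isocalendar()
    # Attribute values must be strings. Passing the raw int is the likely
    # cause of the "cannot be coerced" error seen under Jython.
    return {"year_extracted": str(iso[0]), "week_extracted": str(iso[1])}


def run(session, rel_success):
    flow_file = session.get()
    # session.get() returns None when nothing is queued, so check before
    # touching any attribute.
    if flow_file is None:
        return
    attributes = build_attributes(flow_file.getAttribute("filename"))
    # One call for all attributes instead of repeated putAttribute calls.
    flow_file = session.putAllAttributes(flow_file, attributes)
    session.transfer(flow_file, rel_success)


# ExecuteScript binds `session` and `REL_SUCCESS`. The guard is an addition
# for this sketch so the file can be imported outside NiFi without errors.
if "session" in globals():
    run(session, REL_SUCCESS)
```

The explicit session.commit() from the original script is left out on the assumption that ExecuteScript commits the session itself once the script returns; add it back if that does not hold for your NiFi version.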