That's what we had originally but we felt it was kind of hacky. On Tue, Aug 22, 2017 at 2:05 PM, Matt Burgess <[email protected]> wrote:
> If the order of operations doesn't matter then you could route flow files > that failed because the record existed back to an UpdateAttribute that sets > the statement.type to update and try them again. Not sure what additional > logic is involved (I'm not at my computer ATM) but that's where > PutDatabaseRecord offers an advantage. > > Regards, > Matt > > > On Aug 22, 2017, at 1:48 PM, Austin Duncan <[email protected]> > wrote: > > All of the records have the same schema though I think. It's all the same > kind of data and the same format. We were originally using a PutSql > processor and were having trouble with it. The current setup actually runs > pretty well im just trying to figure out how to do an insert and then if > the data already exists in the table via the rfidnumber do an update > instead. > > On Tue, Aug 22, 2017 at 12:05 PM, Matt Burgess <[email protected]> > wrote: > >> Your use case will be more complex, because you need to form SQL from >> your JSON. You can try ConvertJSONToSQL and then ReplaceText to add the ON >> CONFLICT to the end. >> >> Also if you are using SplitJson then PutDatabaseRecord won't be as >> efficient as it could be; it is meant to work on a number of records in a >> single flow file with the same schema. In your case you might be better off >> with PutSQL as it won't require schemas or statement types. The Split -> >> convert to SQL -> PutSQL pattern was the original way to do things, but for >> multiple records with the same schema, PutDatabaseRecord was added. >> >> Regards, >> Matt >> >> On Aug 22, 2017, at 11:43 AM, Austin Duncan <[email protected]> >> wrote: >> >> The example you listed sounds like a solution I am just having trouble >> fully understanding. I want to make my own sql so that i can do the "insert >> into on conflict" >> I am just having trouble really understanding what it is that I have to >> do so that the query and the data will be understood by the processor. Do I >> need different schemas? One like I sent and one for the query? >> >> On Tue, Aug 22, 2017 at 11:38 AM, Matt Burgess <[email protected]> >> wrote: >> >>> If your incoming data is already in fields (JSON, e.g.) and not a SQL >>> statement, then the statement.type should be "insert" rather than >>> "sql". The "sql" type is for passing in explicit SQL statements to >>> execute rather than taking a record and generating the appropriate SQL >>> statement from the fields and the statement.type. >>> >>> If you are trying to generate your own SQL and execute that, then >>> you'd want to try something like the solution I outlined before, where >>> you put the SQL in a field such as "statement", set statement.type to >>> "sql" and Field Containing SQL to "statement". >>> >>> Does that make sense? Or am I misunderstanding what you are trying to do? >>> >>> Regards, >>> Matt >>> >>> On Tue, Aug 22, 2017 at 11:32 AM, Austin Duncan >>> <[email protected]> wrote: >>> > Matt, >>> > >>> > I am using JsonPathReader and using the 'Schema Text' Property with a >>> schema >>> > defined in there. I could never figure out how to use any of the other >>> > Access Strategies. It's an inventory system so we are extracting the >>> Json >>> > and splitting it into a json file for each row. Then using this >>> schema: { >>> > "name": "insertSql", >>> > "type": "record", >>> > "fields": [ >>> > { >>> > "name": "RfidNumber", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "CabinetName", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "ItemNumber", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "LotNumber", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "PurchaseOrderNumber", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "PurchaseOrderPrice", >>> > "type": "float" >>> > }, >>> > { >>> > "name": "SupplierId", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "SupplierName", >>> > "type": "string" >>> > }, >>> > { >>> > "name": "updatedate", >>> > "type": "string" >>> > } >>> > ] >>> > } >>> > I inserted the data into the table. I am not 100% on the uses of >>> schemas so >>> > I am not quite sure what you mean by using a schema to define the >>> query. >>> > >>> > On Tue, Aug 22, 2017 at 11:23 AM, Matt Burgess <[email protected]> >>> wrote: >>> >> >>> >> Austin, >>> >> >>> >> What are you using for a record reader and schema for >>> >> PutDatabaseRecord? In order to execute SQL using PutDatabaseRecord, >>> >> you have to specify a "Field containing SQL", and the incoming >>> >> record(s) must have a field with that name. The value of that field >>> >> (for each record) will be executed. >>> >> >>> >> What I've done in the past is to put the whole SQL statement in a JSON >>> >> doc: {"query": "INSERT INTO table (column1 ,column2, column3, column4) >>> >> VALUES() ON CONFLICT (rfidnumber) DO UPDATE"} then I set Field >>> >> Containing SQL to "query", and use a JsonPathReader specifying a >>> >> "query" field with a path of $.query, or just a JsonTreeReader, either >>> >> reader using a schema with a single string field called "query". >>> >> >>> >> IMO an improvement would be nice to add a SQLReader that could split >>> >> on newlines or semicolons or whatever, and each "record" would contain >>> >> a SQL statement with the specified field name. That would save the >>> >> trouble of having to temporarily convert your SQL statement(s) into a >>> >> format that an existing Reader can recognize. >>> >> >>> >> Regards, >>> >> Matt >>> >> >>> >> >>> >> On Tue, Aug 22, 2017 at 11:12 AM, Austin Duncan >>> >> <[email protected]> wrote: >>> >> > In my flow I am pulling data from a Json, splitting the Json and >>> then >>> >> > inserting that into a postgres table using the putdatabaserecord >>> >> > processor. >>> >> > I have been using the insert statement option and it has been >>> working >>> >> > fine >>> >> > but now I am trying to figure out how to do a INSERT INTO table ON >>> >> > CONFLICT >>> >> > UPDATE statement. I have the statement.type attribute set to SQL >>> and am >>> >> > trying to do the query: >>> >> > >>> >> > INSERT INTO table (column1 ,column2, column3, column4) >>> >> > >>> >> > VALUES() >>> >> > >>> >> > ON CONFLICT (rfidnumber) DO UPDATE; >>> >> > >>> >> > I am getting the error 'Record schema does not contain filed >>> containing >>> >> > SQL'. So two th Any help would be appreciated. >>> >> > >>> >> > Thanks, >>> >> > >>> >> > Austin >>> > >>> > >>> >> >> >
