[
https://issues.apache.org/jira/browse/TRAFODION-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812592#comment-15812592
]
Sandhya Sundaresan commented on TRAFODION-1562:
-----------------------------------------------
The VSBB operator into the base table in this case is restricted by the fact
that it has inlined actions and it can produce outputs. The suggestion is to
enhance the upsert operator to be allowed to use VSBB even if outputs are
produced.
Capturing the email discussions here :
----------------------------------------------------------
From: Selva Govindarajan
Sent: Monday, January 9, 2017 4:58 AM
To: Suresh Subbiah <[email protected]>; Sandhya Sundaresan
<[email protected]>
Subject: Re: VSBB upsert issue for base table upsert...
Thanks Suresh for the detailed explanation. Now it makes sense to me. Yes.
there is no return row concept for UPDATE/UPSERT/DELETE.
As you said before in this thread, the query in question needs to be checked if
it needs to produce outputs. If it produces output unnecessarily it needs to be
fixed. We can also work on the concept of return row concept even in needs to
produces output to improve performance.
Selva
________________________________________
From: Suresh Subbiah
Sent: Saturday, January 7, 2017 9:41:50 AM
To: Selva Govindarajan; Sandhya Sundaresan
Subject: RE: VSBB upsert issue for base table upsert...
Hi Selva,
I hope I am looking at the correct method. I don’t see anything in
ExHbaseAccessUpsertVsbbSQTcb::work() that will copy a subset of columns from
newly inserted row into the up_queue entry. In Seaquest days, there used to be
a project expression that did this.
Now, ExHbaseAccessSQRowsetTcb::work() does have a RETURN_ROW state. From a
quick read it looks like this is called only for select and not for
update/delete.
As you said before, it maybe a quick extension to create the up-queue row for
upserts. The row after all does not come from HBase, we have it the work
method, so we should be able to associate a returned row, with each input row
synthesized.
Thanks
Suresh
From: Selva Govindarajan
Sent: Saturday, January 7, 2017 8:47 AM
To: Suresh Subbiah <[email protected]>; Sandhya Sundaresan
<[email protected]>
Subject: Re: VSBB upsert issue for base table upsert...
Hi Suresh,
It is not clear to me why upsert can't be VSBB even if it produces output. Can
you please elaborate? Isn't the output produced a part of the tuple of the
queue entry. Anoop also mentioned that it can't be VSBB if it produces output
for delete when I showed it to him.
Selva
________________________________________
From: Suresh Subbiah
Sent: Friday, January 6, 2017 2:36 PM
To: Sandhya Sundaresan
Cc: Selva Govindarajan
Subject: RE: VSBB upsert issue for base table upsert...
Hi Sandhya,
My previous response was not correct. Your debugging leads me to think that
code is correct as is. Do you think this reasoning is sound?
1) If an upsert operator is placing values in this up queue (i.e.
producing outputs) then it cannot be implemented using vsbb
2) Upsert will need to produce outputs when (a) there is syskey column (b)
expressions that are evaluated in the upsert operator (typically these are
expressions that produce the value to be upserted, if they not directly the
same as what is produced by source table) (c) identity values (doubtful about
this)
3) A typical value that is available directly from source side, need not
be “produced” from the upsert oprator. We already have it available from source
side.
Currently producesOutput() method and even runtime code in upsert operator may
not care about item 3) above. We may be “producing” output regardless of
whether it is used upstream or not. So if we can make sure that no output is
actually produced when conditions for item 3) are fully satisfied and we don’t
have anything in the UPSERT that will trigger condition 2), then a VSBB
operator could be used for upsert into the base table.
All this seems rather tricky to do and is liable to break. It may be easier to
enhance UPSERT to be VSBB even outputs need to be produced. Selva had some
ideas on that.
Thanks
Suresh
From: Sandhya Sundaresan
Sent: Friday, January 6, 2017 2:01 PM
To: Suresh Subbiah <[email protected]>
Cc: Selva Govindarajan <[email protected]>
Subject: RE: VSBB upsert issue for base table upsert...
Thanks Suresh. I am ccing Selva too.
So what I can do is rework this restriction to remove the “inlinedAction
&&producesOutputs() s” from the check and see if it causes any problems.
In this case both flags are TRUE. I guess ProducesOutputs is TRUE since this
row flows to the IM tree ? So the base table insert gets converted to a
simple_insert.
(gdb) frame 1
#1 0x00007fe471fe1be3 in HbaseInsert::preCodeGen (this=0x7fe4562bbc38,
generator=0x7ffe69cf05e0, externalInputs=..., pulledNewInputs=...)
at ../generator/GenPreCode.cpp:5488
5488 setInsertType(Insert::SIMPLE_INSERT);
(gdb) p inlinedActions
$17 = 1
(gdb) p producesOutput
No symbol "producesOutput" in current context.
(gdb) p producesOutputs()
$18 = 1
(gdb)
Sandhya
From: Suresh Subbiah
Sent: Friday, January 6, 2017 9:15 AM
To: Sandhya Sundaresan <[email protected]>
Subject: RE: VSBB upsert issue for base table upsert...
Hi Sandhya,
Sorry for the delayed response. Selva had shown me that the choice to use vsbb
is now in the generator as you have found.
I too have no explanation as to why inlinedActions requires an upsert to be
SIMPLE (i.e. non-VSBB). My guess is that this is left over code from before we
changed upsert with inlined actions to be a merge (or your new ‘EFF’ tree). In
those days, if we had an upsert with inlinedActions I suppose it produced
outputs and that we know vsbb cannot support.
If Selva agrees, my suggestion will be to take out he inlinedActions check here.
Thanks
Suresh
From: Sandhya Sundaresan
Sent: Friday, January 6, 2017 1:43 AM
To: Suresh Subbiah <[email protected]>
Subject: RE: VSBB upsert issue for base table upsert...
Hi Suresh,
I debugged this some more and found that it’s getting reset to SIMPLE_INSERT
and hence a regular upsert as opposed to vsbb_upsert in this part in preCodeGen:
I suppose it’s because inlinedActions is TRUE ?
GenPreCode.cpp :
if ((isUpsert()) &&
((getInsertType() == Insert::VSBB_INSERT_USER) ||
(getInsertType() == Insert::UPSERT_LOAD)))
{
if ((inlinedActions || producesOutputs())&& !getIsTrafLoadPrep())
setInsertType(Insert::SIMPLE_INSERT);
}
Not sure if it’s safe to put in a special case for this “efficient” tree. It’s
not clear to me why we cannot have vsbb rowsets if there are inlinedActions
Sandhya
> Changes in query tree when the upsert command is transformed into merge
> -----------------------------------------------------------------------
>
> Key: TRAFODION-1562
> URL: https://issues.apache.org/jira/browse/TRAFODION-1562
> Project: Apache Trafodion
> Issue Type: Sub-task
> Components: sql-cmp
> Reporter: Selvaganesan Govindarajan
> Assignee: Sandhya Sundaresan
> Attachments: BatchUpsertTransformation.pdf
>
>
> to improve the performance as explained in the main JIRA
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)