Re: 0-day CVE in log4j

2021-12-10 Thread Nikita Amelchev
Hello.

The issue to update dependency was created:
https://issues.apache.org/jira/browse/IGNITE-16101

I want to include it to the 2.12 scope.

сб, 11 дек. 2021 г., 09:19 Raymond Wilson :

> All
>
> This blew up today: CVE-2021-44228 (
>
> https://www.bleepingcomputer.com/news/security/new-zero-day-exploit-for-log4j-java-library-is-an-enterprise-nightmare/
> )
>
> Will there be a risk assessment with respect to Ignite for this CVE?
>
> Thanks,
> Raymond.
>
> --
> 
> Raymond Wilson
> Trimble Distinguished Engineer, Civil Construction Software (CCS)
> 11 Birmingham Drive | Christchurch, New Zealand
> raymond_wil...@trimble.com
>
> <
> https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch
> >
>


0-day CVE in log4j

2021-12-10 Thread Raymond Wilson
All

This blew up today: CVE-2021-44228 (
https://www.bleepingcomputer.com/news/security/new-zero-day-exploit-for-log4j-java-library-is-an-enterprise-nightmare/
)

Will there be a risk assessment with respect to Ignite for this CVE?

Thanks,
Raymond.

-- 

Raymond Wilson
Trimble Distinguished Engineer, Civil Construction Software (CCS)
11 Birmingham Drive | Christchurch, New Zealand
raymond_wil...@trimble.com




[DISCUSSION] Validation key type and schema on a data insertion [IGNITE-16098]

2021-12-10 Thread Taras Ledkov
Hi, Igniters.

There was know issue described at the [IGNITE- 16098] case 1.
These issue is the cause of not symmetric cache API and SQL API:
we cannot create the same table by SQL and cache API (with se identical
internal structures).
In more detail: it is impossible now to create a sorted PK index with
unwrapped fields through cach eAPI.

I think it's more convenient when SQL and the cache API have equal
capabilities.

Also I found the second scenario that can break indexes.

Please take a look at the issue [1] description, and let me know what you
think.

[1]. https://issues.apache.org/jira/browse/IGNITE-16098


Re: [DISCUSSION] Reserve partitions for CacheQueries

2021-12-10 Thread Maksim Timonin
Hi!

> Existing users may already have some logic in place to handle
inconsistencies

Pavel, I'm not aware of such users but your comment makes sense. So, I' OK
with adding an option for ScanQuery. Naming of the option is debatable. The
name "reservePartitions" looks good, but we actually already reserve a
partition when it is explicitly specified in ScanQuery. Then, it can be a
bit misleading in the case of explicitly setting this param to `false` with
a specified partition. But we can just mention it in javadocs, that the
setting affects only full scan.

> AFAIR Scan and SQL queries implementations are totally different. Could
you tell me how Index queries fit there?

IndexQuery scans indexes like SQL does with few optimizations of traversing
BPlusTree, also there are some differences in query processing, also
IndexQuery provides sorted results by default. But I don't expect
inconsistency in results for SQL / IndexQuery for similar queries.
Actually, I should add tests for that and fix failures if any.

> In an ideal world it would be great to have only one API

I think it's possible to teach SQL to switch to cache queries for some
cases, or provide such an opportunity with hints or explicit functions. And
these queries might be parts of an SQL query plan. Or we can go even
deeper, maybe it could be like Apache Spark stages, when we can build our
plan with different types of queries, and the same type provides to
users the opportunity to run only specific stages: DataFrame.sql() or
DataFrame.filter(inline=true/false).

> Perhaps IndexQuery should also cover regular entry iteration cases

It's possible too. IndexQuery provides an opportunity to scan the PK index,
then it can start ScanQuery under the hood for some cases.

But anyway, to make them run through the single API, we should provide the
same guarantees.

Then, if there are no objections, the short-term plan is:
1. Implement partition reservation for IndexQuery.
2. Add the option for ScanQuery.
3. Make tests that show that IndexQuery / SQL / ScanQuery is replaceable
for some types of queries in terms of result consistency.
4. Then discuss again, how we can integrate them together.


On Fri, Dec 10, 2021 at 11:41 AM Ivan Pavlukhin  wrote:

> Actually I brought a point about using SQL queries instead of scan
> queries because I worry about inconsistency between different
> implementations. AFAIR Scan and SQL queries implementations are
> totally different. Could you tell me how Index queries fit there?
>
> My general ideas are as follows:
> 1. In an ideal world it would be great to have only one API (to rule
> them all) and implementation to cover all cases (i.e. scan, index,
> SQL). Also, I wonder how other vendors tackle this?
> 2. New IndexQuery might not mimic ScanQuery behavior but instead have
> a correct/expected one (including partition reservation). Perhaps
> IndexQuery should also cover regular entry iteration (scan) cases and
> become a new ground for generalized scan/index queries.
>
> 2021-12-09 17:08 GMT+03:00, Pavel Tupitsyn :
> > IndexQuery is experimental, so we can indeed make it reserve partitions
> by
> > default. No objections here.
> >
> > But with ScanQuery, I think it's better to add a "reservePartitions"
> > property so that default behavior is not affected.
> > Existing users may already have some logic in place to handle
> > inconsistencies. So if we enable reservation by default,
> > they'll get an unnecessary performance degradation.
> >
> > On Thu, Dec 9, 2021 at 4:42 PM Maksim Timonin 
> > wrote:
> >
> >> Hi, Ivan, Pavel! Thanks for your responses.
> >>
> >> > But is there a real need/benefit from using scan queries over
> primitive
> >> SQL queries today
> >>
> >> In addition to Pavel's response, I'd like to add about IndexQuery. This
> >> query is in experimental status, but it shows great performance and
> beats
> >> SQL for *some* type of queries. For IndexQuery we can leverage on
> >> understanding that we use only index and apply some optimizations: skip
> >> boundaries check (h2 Select.isConditionMet), apply filtering by inline
> IO
> >> instead of extracting fields from BinaryObjects, also we skip some H2
> >> related stuff. Also it's not implemented yet, but for IndexQuery we can
> >> access data pages sequentially instead of randomly (by iterating with
> >> IndexRow cursor by batches), and I expect it may make performance better
> >> in
> >> some highload cases.
> >>
> >> > it is desirable that successful (but may be strange) scenarios
> continue
> >> to work after fix (and not fail e.g. due to some introduced exception)
> >>
> >> > Partition reservation should be opt-in if we decide to proceed.
> >>
> >> With the approach I proposed we can reach it. Without specified query
> >> timeout (and cache queries don't have a public API for setting timeout),
> >> we
> >> can retry our attempts to reserve partitions while we do not succeed.
> So,
> >> users may get some increase in query time execution under unstable

Re: [DISCUSSION] Reserve partitions for CacheQueries

2021-12-10 Thread Ivan Pavlukhin
Actually I brought a point about using SQL queries instead of scan
queries because I worry about inconsistency between different
implementations. AFAIR Scan and SQL queries implementations are
totally different. Could you tell me how Index queries fit there?

My general ideas are as follows:
1. In an ideal world it would be great to have only one API (to rule
them all) and implementation to cover all cases (i.e. scan, index,
SQL). Also, I wonder how other vendors tackle this?
2. New IndexQuery might not mimic ScanQuery behavior but instead have
a correct/expected one (including partition reservation). Perhaps
IndexQuery should also cover regular entry iteration (scan) cases and
become a new ground for generalized scan/index queries.

2021-12-09 17:08 GMT+03:00, Pavel Tupitsyn :
> IndexQuery is experimental, so we can indeed make it reserve partitions by
> default. No objections here.
>
> But with ScanQuery, I think it's better to add a "reservePartitions"
> property so that default behavior is not affected.
> Existing users may already have some logic in place to handle
> inconsistencies. So if we enable reservation by default,
> they'll get an unnecessary performance degradation.
>
> On Thu, Dec 9, 2021 at 4:42 PM Maksim Timonin 
> wrote:
>
>> Hi, Ivan, Pavel! Thanks for your responses.
>>
>> > But is there a real need/benefit from using scan queries over primitive
>> SQL queries today
>>
>> In addition to Pavel's response, I'd like to add about IndexQuery. This
>> query is in experimental status, but it shows great performance and beats
>> SQL for *some* type of queries. For IndexQuery we can leverage on
>> understanding that we use only index and apply some optimizations: skip
>> boundaries check (h2 Select.isConditionMet), apply filtering by inline IO
>> instead of extracting fields from BinaryObjects, also we skip some H2
>> related stuff. Also it's not implemented yet, but for IndexQuery we can
>> access data pages sequentially instead of randomly (by iterating with
>> IndexRow cursor by batches), and I expect it may make performance better
>> in
>> some highload cases.
>>
>> > it is desirable that successful (but may be strange) scenarios continue
>> to work after fix (and not fail e.g. due to some introduced exception)
>>
>> > Partition reservation should be opt-in if we decide to proceed.
>>
>> With the approach I proposed we can reach it. Without specified query
>> timeout (and cache queries don't have a public API for setting timeout),
>> we
>> can retry our attempts to reserve partitions while we do not succeed. So,
>> users may get some increase in query time execution under unstable
>> topology. But in exchange users will get more stable results, and avoid
>> exceptions like in the ticket I showed [1].
>>
>> We can implement this logic for IndexQuery only (as it's only
>> experimental
>> since 2.12), and extend it on ScanQuery later after stabilizing with
>> IndexQuery. WDYT?
>>
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-12591
>>
>> On Thu, Dec 9, 2021 at 10:39 AM Pavel Tupitsyn 
>> wrote:
>>
>> > Agree with Ivan regarding compatibility.
>> > Partition reservation should be opt-in if we decide to proceed.
>> >
>> > > is there a real need/benefit from using
>> > > scan queries over primitive SQL queries today
>> >
>> > SQL requires additional configuration (QueryEntity) and involves memory
>> and
>> > CPU overhead to maintain the indexes.
>> > Especially if the query is used rarely (maybe some maintenance tasks),
>> scan
>> > is a better option.
>> >
>> > On Thu, Dec 9, 2021 at 10:12 AM Ivan Pavlukhin 
>> > wrote:
>> >
>> > > Hi Maksim,
>> > >
>> > > Thank you for looking into this. While fixing wrong/surprising
>> > > behavior is very important, I also have some concerns, let's say,
>> > > from
>> > > different angle of view:
>> > > 1. From a first glance it seems that similar behavior of scan and SQL
>> > > queries is a good idea. But is there a real need/benefit from using
>> > > scan queries over primitive SQL queries today (e.g. SELECT * FROM
>> > > table1)?
>> > > 2. Also we need to carefully think about compatibility. It is
>> > > desirable that successful (but may be strange) scenarios continue to
>> > > work after fix (and not fail e.g. due to some introduced exception).
>> > >
>> > > Regarding partition reservation for long-time open cursors I guess
>> > > some ideas might be found in LAZY mode for SQL queries [1].
>> > >
>> > > [1] https://issues.apache.org/jira/browse/IGNITE-9171
>> > >
>> > > 2021-12-08 20:11 GMT+03:00, Maksim Timonin :
>> > > > Hi, Igniters!
>> > > >
>> > > > There is a known issue that ScanQuery on unstable topology returns
>> > > > incorrect results: duplicates data [1] or fails with an exception
>> [2].
>> > > The
>> > > > reason is ScanQuery doesn't reserve partitions.
>> > > >
>> > > > IndexQuery shares the same query processing as ScanQuery, and then
>> > > > it
>> > is
>> > > > also affected by unstable topology. I want to fix it for
>> > > > I