On Wed, Mar 29, 2017 at 4:23 PM, Martin Mucha <[email protected]> wrote:
> Hi, > > I didn't want to step into this, but I was asked by dan to express my > opinion about this. I'm not pushing for change (since I believe it will be > blocked), I'm just trying to help us focus. We should have some questions > answered prior to saying what we want to do... > > Current state of accessing db is that there's fired lots of unnecessary db > calls, just because there isn't easy way to do them correctly, someone is > lazy, whatever, that isn't important. The blame was laid on invalid code > review. Now let me remember maintainer Kolesnik, who during CR *insisted* > on creating N+1 problem[1] when accessing db, to save ±10 lines of code in > dal layer. So Eli is right, our code suck because of improper code revision > process, but sometimes it is maintainers who wants invalid code to be > pushed to simplify db access code. Therefore we cannot rely on review > process, this is real issue we have. I'm fine with it, this is how we do > things. But we also have (sometimes (a lot)) degraded performance because > of it. > > We should have answered following questions: > > 1) is performance even important at all? Should we optimize db access? I'm > assuming "yes" should be the answer. If it's not the answer, we can skip > next questions. > > 2) can someone responsible for whole project provide timings and count of > queries? This should easily bring crosshair on commands requiring attention. > 2.1) customer is issuing synchronous command. What is maximum number of > queries this command (plus all transitive internal commands) should fire? > What's the number for asynchronous command? > 2.2) same for time; what is the maximum time for synchronous and > asynchronous command? > > 3) lets say that we found command which requires optimization (and that > should not be hard). If we go back to mentioned N+1 problem, it's rather > easy to avoid it for one association(yet we did not do it even there), but > it's not that easy for multiple joins [1]. But it's superbly easy to > achieve it using ORM tool, even with still using only named queries, even > native ones defined outside of java. Supposing we want to optimize found > command, what would dal maintainers consider as a acceptable optimization? > Is there even possibility to not use stored procedures? I believe saying > simple "no", if that's the case, is valid response, which can save time. If > there is theoretical possibility, what criteria must be met so that we can > consider replacing most trivial stored procedures with some tool in our > project? There are lots of tools/frameworks, based on your criteria we > might find some, which would work best... > > ————— > > My generic opinion would be to stick with sp and allowing to use something > less heavy for simplest queries/stuff. It could beJPA using ORM and > entities. Or named queries (still using entities) or nativequeries (which > does not use entities, but plain sql) and both ends up creating prepared > statements on server startup. Or we could use some lighter orm. But I'd > definitely avoid writing any new homebrewed approach, this isn't a sane > option. > In that case , if we will not use SP , we still will have to secure the data (for example , hidden columns for some users like in the VDS view) > > [1] example: Lets discuss query association A—>B—>C, 5xA, each A has 5B > etc. In our typical dao this will mean 1+5+5*5=31 queries. Same with > properly annotated ORM sums up to 1 or 3 queries. > > > M. > > On Mon, Mar 27, 2017 at 5:38 PM, Yevgeny Zaspitsky <[email protected]> > wrote: > >> >> >> On Mon, Mar 27, 2017 at 5:30 PM, Yevgeny Zaspitsky <[email protected]> >> wrote: >> >>> >>> >>> On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <[email protected]> wrote: >>> >>>> >>>> >>>> On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky < >>>> [email protected]> wrote: >>>> >>>>> > I don't think that looking in SQL in Java code is clear than >>>>> looking in a SP code >>>>> >>>>> Looking in SQL isn't the problem I'm trying to solve, but finding that. >>>>> >>>>> > 1) You will pass more data on the wire instead of calling a SP with >>>>> parameters >>>>> >>>>> Passing the SQL string shouldn't be a problem unless that is very long >>>>> one (several K) and then we probably do something wrong. >>>>> >>>>> > 2) Your data that is passed on the wire is exposed to attacks since >>>>> you will have to implement DB security in the engine level (for example >>>>> hidden columns). >>>>> >>>>> IIUC currently querying tables/views with hidden columns is >>>>> implemented in a SP that consist of at least 2 SQL's, so those aren't good >>>>> candidates for my proposal and will stay as is. >>>>> BTW how other projects resolve the security problem? AFAIK usually >>>>> hidden columns are hidden by defining a proper view that do not include >>>>> those. >>>>> >>>> >>>> That's a bad solution as long as your data is passed unmasked on the >>>> wire >>>> >>> >>> In some cases we do send a sensitive info in the current solution that >>> uses SP's. >>> Should we use (maybe already) a kind of secured connection to DB? >>> >>> Database security should be done in the database level and you will not >>>> be able to do that without using SPs >>>> >>> >>> I'm not a DB security expert, but from my experience from my previous >>> jobs none of them used "SP's only" approach, in fact they didn't use SP's >>> at all. >>> >>>> >>>> >>>> >>>>> >>>>> > 3) Changes in the SQL code done in patches may be more complicated >>>>> to track >>>>> >>>>> Most of SQL code changes involve changes in a DAO too, so this >>>>> shouldn't be an issue. >>>>> >>>> It is !!! , changes in DAO are easy to track since it is a Java code , >>>> you are suggesting to write the SQL itself >>>> >>> >>> If you need to update a Java code together with SQL, then how much >>> difference is in seeing the change in a single file or in two separate ones? >>> Frankly speaking I do not get what do you mean by "track". If you're >>> about following what a DAO method does, then my approach simplifies the job. >>> >>>> >>>> >>>>> >>>>> > 4) SQL Injection >>>>> >>>>> Again, how other projects resolve the security problem? Internet is >>>>> full of articles/blogs of why stored procedures should be avoided ("avoid >>>>> >>>>> stored procedures" Google query returned ~6.6M results), so I guess >>>>> there are some other approaches for the security issue if that exists. >>>>> >>>> >>>> "use stored procedures" Google query returned About 4,840,000 results >>>> >>>> How many of those add "do not" to the search term? Anyhow avoid gives >>> more results... >>> >> >> Again, I'm not a DB security expert, but IIUC using prepared statements >> with parameters (what my patches do) doesn't allow a SQL injection. On the >> other side the existing SearchDao.getAllWithQuery implementors are exposed >> to SQL injection attempts. Am I right? >> >> >>>> >>>>> >>>>> > 5) I think that SP performs better than SQL inside Java >>>>> >>>>> Do we have a measurement of how much better? >>>>> One of my intentions for this proposal is that people evidently avoid >>>>> creating neat SQL queries and re-use the existing ones, which has much >>>>> bigger performance impact. I guess that the biggest limit here is how >>>>> complicated that procedure is in the engine code. >>>>> >>>> >>>> That's why we have code review process and we should nack such >>>> patches , so you think that if people are not familiar with Java8 syntax >>>> for example we should move this code to be performed by a "easier" >>>> mechanism ? >>>> If people are not doing the right thing , we have gerrit for that, we >>>> can comment , nack , whatever to make the code good >>>> >>> >>> In a perfect word you're right, proper reviews should've stop that. But >>> in the hard reality of oVirt code base it doesn't happen somehow... My >>> proposal bases on the current situation of oVirt code that was created by >>> using all Gerrit features. >>> >>>> >>>> >>>> >>>>> >>>>> >>>>> On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <[email protected]> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Hi All, >>>>>>>> >>>>>>>> Recently I had a task of performance improvement in one of our >>>>>>>> network related flows and had some hard time following our DAL code >>>>>>>> and one >>>>>>>> of the outcomes of the task was defining a couple of new quite simple, >>>>>>>> but >>>>>>>> neat queries. >>>>>>>> When I came to coding those new queries I remembered how hard was >>>>>>>> following the existing DAL code, I decided to make my own new methods >>>>>>>> clearer. So I created [1] and [2] patches. >>>>>>>> >>>>>>>> Everything is quite standard there beside the fact that they do not >>>>>>>> use any stored procedure, but use SQL directly, IMHO by that they save >>>>>>>> time >>>>>>>> that I spent in trying to follow what a DAO method does. Looking into >>>>>>>> the >>>>>>>> method code you get the understanding of what this method is all about: >>>>>>>> >>>>>>>> - no looking for a stored procedure name that is buried in the >>>>>>>> DAO class hierarchy >>>>>>>> - no looking for the SP definition >>>>>>>> >>>>>>>> >>>>>>> There are additional pros and cons for the suggestion to consider: >>>>>>> >>>>>>> Pros: >>>>>>> >>>>>>> 1. No need to run engine-setup after changing DB related code >>>>>>> (in case of SQL inside Java). >>>>>>> >>>>>>> Cons: >>>>>>> >>>>>>> 1. DAO files might become very long. >>>>>>> 2. If you decide to return the business entity associated with >>>>>>> the DAO as a returned object, you won't know as a caller which >>>>>>> fields to >>>>>>> expect to be populated, which lead to 3: >>>>>>> 3. An inflation of business entities to represent partial >>>>>>> populated business entity or inflation of mappers inflation (this >>>>>>> will be >>>>>>> required for SP as well). >>>>>>> 4. SQL code inside of Java: >>>>>>> 1. Beside of the fact that a multi-line concatenated string that >>>>>>> cannot be easily copied and run with psql, it means that we >>>>>>> should compile >>>>>>> the code in order to test the change (vs building with >>>>>>> DEV_REBUILD=0 which >>>>>>> only package the SQL file). >>>>>>> 2. No syntax highlighting when performing code review. i.e. I >>>>>>> don't think reviewing a patch such as >>>>>>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >>>>>>> twork_sp.sql >>>>>>> >>>>>>> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >>>>>>> would be more clear inside a java file. >>>>>>> 3. The user permissions management is implemented on DB >>>>>>> level. That means that SQL will be more complex (more >>>>>>> concatenated java >>>>>>> lines). >>>>>>> 5. Stored procedure are cached by project's code. See >>>>>>> SimpleJdbcCallsHandler.getCall(), while the >>>>>>> NamedParameterJdbcTemplate are cached by spring's code which seems >>>>>>> less >>>>>>> optimal (sync all calls using synchronized vs using >>>>>>> ConcurrentHashMap as in >>>>>>> SP cache). >>>>>>> 6. With the NamedParametersJdbcTemplate there is no use of the >>>>>>> DbEngineDialect. What's the impact of it ? >>>>>>> >>>>>>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>>>>>> more opinions from other members. >>>>>>> >>>>>> >>>>>> I agree with all you wrote Moti >>>>>> I don't think that looking in SQL in Java code is clear than looking >>>>>> in a SP code >>>>>> Additionally >>>>>> 1) You will pass more data on the wire instead of calling a SP with >>>>>> parameters >>>>>> 2) Your data that is passed on the wire is exposed to attacks since >>>>>> you will have to implement DB security in the engine level (for example >>>>>> hidden columns) >>>>>> 3) Changes in the SQL code done in patches may be more complicated to >>>>>> track >>>>>> 4) SQL Injection >>>>>> 5) I think that SP performs better than SQL inside Java >>>>>> >>>>>> I see no real reason to replace the SPs with SQL code , SP is just a >>>>>> container for SQL code >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> Moti >>>>>>> >>>>>>> So I'd like to propose moving towards this approach in general in >>>>>>>> all cases when nothing beyond a simple SQL is needed (no stored >>>>>>>> procedure >>>>>>>> programming language is needed). >>>>>>>> From my experience with the performance improvement task it looks >>>>>>>> like people avoid adding new queries for a specific need of a flow, >>>>>>>> instead >>>>>>>> they use the existing general ones (e.g. dao.getAllForX()) and do the >>>>>>>> actual join in the bll code. >>>>>>>> IMHO the proposed approach would simplify adding new specific >>>>>>>> queries and by that would prevent a decent part of performance issues >>>>>>>> in >>>>>>>> the future. >>>>>>>> >>>>>>>> I do not propose changing all existing SP's to inline queries in a >>>>>>>> once, but to allow adding new queries this way. Also we might consider >>>>>>>> converting relatively simple SP's to inline SQL statements later in a >>>>>>>> graduate way. >>>>>>>> >>>>>>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>>>>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>>>>>> >>>>>>>> Regards, >>>>>>>> Yevgeny >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Devel mailing list >>>>>>>> [email protected] >>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Regards, >>>>>>> Moti >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Devel mailing list >>>>>>> [email protected] >>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >> _______________________________________________ >> Devel mailing list >> [email protected] >> http://lists.ovirt.org/mailman/listinfo/devel >> > >
_______________________________________________ Devel mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/devel
