On 27-04-2012 12:57, Jeremy Evans wrote:
On Friday, April 27, 2012 5:05:21 AM UTC-7, Rodrigo Rosenfeld Rosas wrote:

    Of course Rails console will have to delegate to IRB, Pry or other
    interactive shell at some point. But none of them (including
    bin/sequel) will load the Rails environment alone. For example,
    you won't have access to the 'helper' method (or variable, not
    sure) inside a normal IRB or sequel session. Also, they won't run
    the initializers, etc.

    So, I'd have to create my own console. This is not a great thing
    to do. For example, if I wanted to create a gem for integrating
    Sequel with Rails, this wouldn't be the right approach to take.
    Currently I would depend on asking the Rails core team to add a
    hook that would allow me to call "Console.start" inside a block,
    and on them accepting it.


Well, the current way Rails does it has some bad corner cases. In addition to the threading issues (discussed below): what if your app uses more than one database? From the looks of the Rails sandbox, only the default database is sandboxed.

Yeah, you're right about that.

    Another approach would be creating another command like "rails
    sequel-console" aliased as "rails sc", but that is clearly a
    desperate solution.

    But I guess there is another reason why ActiveRecord preferred the
    at_exit approach: that block will probably only run after all
    threads have finished. So even if you started some thread in the
    console and changed the database from there, it would still roll
    back all changes. Personally, I would be fine not supporting such
    an edge case, but maybe I won't be able to convince them to add
    such a hook. I'll see.


I'm guessing you are wrong about that. Starting a new thread would probably give you a new connection which would not be inside a transaction. I'll leave it up to someone else to test that theory.

Yes, that makes sense. I would also bet that their connection pool is based on Thread.current as well.

The correct way to do such a sandbox in Sequel would be to use after_connect to ensure that all new connections were inside a transaction. This would not be difficult to add via a Sequel extension, and could work without changes to Rails (assuming the appropriate Railtie).

I can get the sandbox information through "Rails.application.sandbox?". Could you give an example of how to use Sequel's "after_connect" hook to ensure every connection is inside a transaction?

    Each spec is already running in its own transaction. I'm just
    trying to get the before(:all) code to also run inside the
    outermost transaction.


What you proposed was using a transaction around multiple specs, and having each spec run in its own savepoint. That is not the same as running each spec in its own transaction. Savepoints are not the same as transactions.

Ah, ok. The PostgreSQL documentation says it supports nested transactions through savepoints, which is why I was calling them that :)

    I guess RSpec doesn't implement around(:all) because it may run
    the after(:all) hooks inside an at_exit block, maybe to prevent
    side effects from running threads, but I'm just guessing. I
    haven't looked at the RSpec code yet.


I haven't looked either. My expectation would be that around(:all) would run only around all of the related specs (similar to: run_before(:all); run_all_related_specs; ensure run_after(:all)).

    Note I think that may be a general problem with your idea of
    using a transaction around all the specs and a savepoint for each
    individual spec.

    Could you please elaborate on that? Would you suggest another
    approach (rather than DatabaseCleaner.truncate) to avoid creating
    the same records all over again for use in some examples?


If there is a disconnection inside a spec, the connection would be removed from the pool. The next time a query was issued, a new connection would be created, and this wouldn't be inside a transaction block. If you use the block-based API, you don't need to worry about this, as it handles the situation correctly.

Oh, I see, I hadn't thought about that possibility; you're right. I guess managing the connections myself is really more complicated than I initially thought. I'll try to figure out another way of supporting those RSpec and Rails sandbox cases when I get some time.

    I have already explained to you the reasons why I think
    transactions and savepoints would be a better option. It would be
    much faster, and I could keep some unchanged data always set up in
    the database instead of recreating it in my specs.


Your suggestion is a hack to try to increase performance at the expense of reliability. Running each spec in its own transaction is far more reliable. Either recreate the unchanged data inside the transaction, or load the unchanged data before the transaction and delete it afterward.

Okay, I'm still looking for a faster way of doing this. Maybe I should take yet another path, but currently I have no idea what that path would be.

        I think it is a mistake to try to hide complexity from
        developers. They should be given the choice and should be
        able to understand the pros and cons of each approach and
        then make the decision. You're not responsible for others'
        mistakes.


    You are entitled to your own opinion. However, in my opinion much
    of good programming is hiding the underlying complexity from
    developers.

    Yeah, but this rule applies to my situation as well. It would be
    much less complex for me to just call transaction.begin and
    transaction.rollback instead of having to dig into Sequel's source
    code to call private methods to do what I want. That is much more
    error prone and makes it much more difficult for other developers
    to understand. Just like requiring other developers to run
    "rails sc" instead of "rails c" if they want to use a sandboxed
    version.


Well, like I mentioned above, there is a way to implement a correct sandbox in Sequel.

    So, when you hide complexity by limiting the usage to the most
    common form, you increase complexity for other, less common
    usages.


I'm OK with that, especially since the less common usage in this case is a usage case I don't want to encourage.

    This is not a matter of being able to or not. Other developers
    that looked at my code should also be able to understand the
    code's intention at a glance. Also, working with a public API is
    much more future-proof, in the sense that it is less likely that
    a Sequel upgrade would break our hack.


I don't think the Sequel code I gave you would be misunderstood in terms of intentions.

It is true that a public API would be more future-proof, but that would encourage people to rely on it, which I don't want people to do.

    And JRuby was exactly what I had to resort to at that time, when I
    needed to use my full CPU power for a processing-intensive task
    that took about 2 hours to complete. It would have taken about 12
    hours to complete on MRI Ruby. If you're curious for more details:

    
http://rosenfeld.herokuapp.com/en/articles/ruby-rails/2012-03-04-how-nokogiri-and-jruby-saved-my-week


Even in single threaded code, JRuby often performs better.

That was not the case in my experiments. Nokogiri performed a single operation a little faster with Ruby 1.9.3 than with JRuby, and it would probably have finished my task faster on MRI 1.9.3 than on JRuby if I were using a single thread.

I'm not arguing that threads can't increase performance. But a multithreaded app is much more difficult to debug than a multiprocess app, in my experience.

It can surely be, but it isn't always true. And there is a whole science of creating more reliable multi-threaded programs. I did lots of multi-threaded programming in my old C and C++ days, and most of the time working with threads was a breeze.

It mostly depends on the system design and how many lock dependencies you have.

You'll see lots of thread-safe Java libraries out there, and most of them simplify the locking logic by using "synchronized" methods, and that is OK. They may not have the best performance in all situations, but the code is really easy to understand and the performance is usually good enough.
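The Ruby analogue of that pattern is a per-object Mutex guarding every public method. A small self-contained sketch (the class name is made up):

```ruby
# Ruby analogue of Java's synchronized methods: one Mutex per object,
# taken at the top of every public method. Coarse-grained locking,
# but easy to reason about. (Mutex is built into Ruby's core.)
class Counter
  def initialize
    @mutex = Mutex.new
    @count = 0
  end

  def increment
    @mutex.synchronize { @count += 1 }
  end

  def value
    @mutex.synchronize { @count }
  end
end

counter = Counter.new
threads = 10.times.map { Thread.new { 1000.times { counter.increment } } }
threads.each(&:join)
counter.value  # => 10000
```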

I've been writing a lot of multi-threaded code in the last decade, and I find it more of a blessing than a danger.

--
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sequel-talk?hl=en.
