On 27/10/2014 5:34 PM, "Asi Sudai" <[email protected]> wrote:
>>
>> But what kind of problems were you actually facing?
>
> I'll admit it's driven more by curiosity and "philosophy" of separating
components more than actual need to overcome a problem.
>
> since sqlalchemy gives an easy framework to do so, it's not very
complicated and all the sharding logic is stored in the backend, so the
front end tools aren't aware of this at all.
>
> by sharding the database I hope we could:
> - easier to backup and restore ( much smaller set of data )
> - easier to archive together with the entire project files ( cause that
DB is isolated from the rest of the projects data )

I would think this is only a concern if you are generating massive amounts
of data and find a performance benefit in being able to remove chunks of
data from being queried. Then again, if your project fields are indexed
anyways, wouldn't queries still be fast when they include a filter on the
project?

> - set permissions based on a project/shard

Permissions could be part of the application layer. I'm not sure how
sharding the database makes this any easier. If user X is not allowed to
access project Y, then you can fail that request at the application level.

> - scalability where each shard could live on a different server or
location (if needed)

Or you can just try to improve performance with replicated slaves? Unless
your bottleneck is writes, but even then you should be able to accomplish
something with replicas.

>
> did anyone try/needed that in a production environment?
>

As mentioned above, I'm pretty sure these problems can either be addressed
without partitioning across multiple databases, at the application layer.
Or, some of them aren't actually a problem. Like you said, this is
currently driven by "curiosity and a philosophy", so it doesn't sound like
you are specifically trying to solve a current problem you are facing.

> On Wednesday, 1 October 2014 14:36:38 UTC-7, Justin Israel wrote:
>>
>> But what kind of problems were you actually facing, if I might ask? Did
you encounter problems with table locking? If so, is your usage profile
write heavy and are you using MyISAM (table locking) vs InnoDB (row
locking)? Were your major pain points the archive process? Is your dataset
unbelievably large and causing slow queries?
>>
>> Just curious what is prompting the investigation into partitioning.
Also, it looks like Postgres supports table partitioning:
>> http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
>>
>>
>>
>> On Thu, Oct 2, 2014 at 10:09 AM, Asi Sudai <[email protected]> wrote:
>>>
>>> I'm using a recent release of MySql with sqlAlchemy as the ORM... but
could move to a different sql if there's a better one if needed.
>>>
>>> Since I never done this before, we started prototyping the database and
the code side, to see how complex sharding would be.
>>> the first issue that came up is the ID, it's not unique, there are many
Asset with ID 1 for example.
>>> So the way around that is to create a truly unique_id across all the
databases. so the Asset ID is 1 but the uniqueID is: somethinglong001101
>>>
>>> the next step would be the code side, how can we hide the complexity of
the sharding from the tools and pipeline, we don't want them to "know"
about it...
>>>
>>> I hope this prototype will show if it's worth it or not...
>>> worth it or not, I guess it comes down to is the sharding gives us
flexible but at the same time the complexity is hidden from our day to day
tools/code :)
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Monday, 29 September 2014 18:47:56 UTC-7, Justin Israel wrote:
>>>>
>>>> You are referring to this generic wiki about partitioned data, but are
you specifically talking about MySQL as your target? The reason I ask is
because some databases have built in concepts of sharding and routing. And
also some databases may use row locking instead of table locking. Maybe it
is best to investigate the right database for your needs before looking at
trying to implement a manual mysql partitioning abstraction?
>>>>
>>>> I agree about some of the advantages, but I wonder if they outweigh
the disadvantages. It would mean that each time you create a new project, a
new database must be created before applications will be able to function.
And you have to maintain your own routing logic, which may or may not be a
big deal at all.
>>>>
>>>> Has it proven difficult to extract a single projects data set for
backups, just using queries? And is your project dataset so large that it
needs to be off-lined with each final project?
>>>>
>>>> hey :)
>>>>
>>>> Question what you think/know about Horizontal Partitioning of each
project data into a "shard" database?
>>>> distributed project database in a sense that all the project have the
same scheme But they have separated database.
>>>> http://en.wikipedia.org/wiki/Partition_(database)
>>>>
>>>> for example projectA have tables (Shot, Asset) and projectB have the
same tables with the same fields,
>>>> but projectA and projectB are on different database or even servers.
>>>>
>>>> And using a common lookup project table, you can find which project
points to which database, for you code to work with
>>>>
>>>> Why we're thinking about it?
>>>> having each project individual database have some advantages:
>>>> - easier to backup and restore ( much smaller set of data )
>>>> - easier to archive together with the entire project files ( cause
that DB is isolated from the rest of the projects data )
>>>> - less table locking wait time, cause each project have different
database and tables
>>>>
>>>> badness coming from this?
>>>> - complexity ... each item needs a unique_id across all the tables,
something like ( projectID_ID )
>>>> - complexity ... where do you store common tables ( Users for example )
>>>> - you can't do "join" across databases... do get all artists Tasks
across shows become multiple queries, one per project
>>>>
>>>> any thoughts?
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
Groups "Python Programming for Autodesk Maya" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
>>>> To view this discussion on the web visit
https://groups.google.com/d/msgid/python_inside_maya/6dbde745-cfdf-4609-ab09-714f99d37ba4%40googlegroups.com
.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
Groups "Python Programming for Autodesk Maya" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
>>> To view this discussion on the web visit
https://groups.google.com/d/msgid/python_inside_maya/f3a798ff-1d8f-41da-bd91-9017c522f4bf%40googlegroups.com
.
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>
>>
> --
> You received this message because you are subscribed to the Google Groups
"Python Programming for Autodesk Maya" group.
> To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
> To view this discussion on the web visit
https://groups.google.com/d/msgid/python_inside_maya/eea91bfb-ca17-404a-839b-ab29d1f6319c%40googlegroups.com
.
>
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Python Programming for Autodesk Maya" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/python_inside_maya/CAPGFgA06iO%2BAsiK_tj4XSbTg06X9qxUPV2eFD4QQ8CsRvmSVjA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to