Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-21 Thread David Harkness
On Fri, Jan 21, 2011 at 4:44 AM, Dotan Cohen  wrote:

> Then I would have to check what values are available when inserting,
> and possibly normalise every so often. I'll think about that, and when
> I have enough data in the database I'll set up a test system to play
> with the possibility.
>

Yes, it's a much more complicated process, and one that I wouldn't enter
into without verifying there was a performance problem. The neat thing is
you could do this at any point as you can renumber the left and right values
at will--as long as you don't *reorder* them.


> I see, thanks. Good point about making sure that the problem exists
>
before trying to fix it, I've seen people optimise away where there is
> no bottleneck.
>

People who love to work on complicated problems often fall victim to
premature optimization because it's fun to do. I was guilty of this myself
earlier in my career, but as CPUs and memory and disk speeds increased over
the decades, it's harder to justify it. At least back then the optimization
that wasn't *necessary* still had a good payoff. Nowadays you can waste days
shaving 10ms off of an operation that takes 1s. Congratulations, you've just
poured $$ down the drain. But boy wasn't that fun! :)

Good luck with the implementation. I'd love to hear how it pans out once
you're done, and I'm sure the list would find it useful as well.

David


Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-21 Thread Dotan Cohen
On Fri, Jan 21, 2011 at 12:29, Richard Quadling  wrote:
> Changing data in a database is the role of the database engine. It is
> much more efficient to have the cost on the insert than it is on the
> select.
>

Agreed. On insert I could even delegate the operation to another
thread which does not timeout with the pageload.


> The adjacent list model is very expensive at n-levels for the select,
> but trivial cost for the insert. If you are inserting millions of rows
> but only occasionally looking at the data, then stick with the
> adjacent list model. But if tags and n-levels are regularly accessed
> and form a main part to the functionality of the app, then you may
> want to reconsider.
>

I've already reconsidered after some sleep and coffee!


> Sure, the insert for the nested set model is more expensive in terms
> of the number of rows to amend, but indexing will certainly should
> certainly help. If you have tools to help optimize the tag table and
> the queries you use, then I'd follow the recommendations (I use MS
> SQL, so my Query Optimization tools help me here). The nested set
> model is extremely efficient on the select.
>

Interesting. I am using MySQL for this application, but another hat I
wear is learning C# with MS tools and I will have to look into the
Query Optimisation.


> It is a trade off that you have to decide upon, based upon your data
> and needs. If, as I suspect, you are going to be doing a LOT of
> selects on the tags and (in the future) to multiple levels, then this
> aspect needs to be very efficient.
>

You suspect correctly.


> For me it is well worth the effort of moving from the adjacent list
> model to the nested set model.
>
> Both mechanisms work. In my opinion, the adjacent list model is for
> truly simply lookups, not for complicated n-levels.
>
> One of the changes I made to the nested set model was for a Bill Of
> Materials module. The client made complex machinery (industrial
> lathes). The sum quantity for all the parts were in the 20,000 region.
> Each nut, bolt, screw, etc. Many sub-assemblies. The parent/child node
> logic was massive in dealing with retrieving questions like "How many
> machines can we build?", "What stock do we need to buy/make to
> complete an order of 20 lathes?". Lot's of recursion into each level
> to build the list. Getting the results would take 3 or 4 minutes (this
> is in a non SQL environment using a peer-to-peer modified D-ISAM
> database - it was already slow because of all that). When I moved to
> the nested set model, no recursion and 1 query (more or less) and I
> have all the results I needed. It was seconds in comparison.
>

Thanks, I enjoy reading these real-life scenarios. This was a terrific example.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-21 Thread Dotan Cohen
> If you are doing this often, you could leave spaces in the left and right
> values so that you could minimize the number of rows that need to be
> updated. The article makes every leaf use x and x+1 for left and right which
> forces another update to add a child. If instead you used x and x+20 you'd
> leave space for more children without any updates. This could be applied
> from top to bottom, starting with the root category getting 0 and MAX_INT
> for its values.

Then I would have to check what values are available when inserting,
and possibly normalise every so often. I'll think about that, and when
I have enough data in the database I'll set up a test system to play
with the possibility.


> However, it's probably not even worth applying that complexity until you
> prove that frequent category additions are causing problems. Most systems
> will be querying against the categories table far more frequently, and
> that's where this model pays off. If you want to see all products in
> category X and its subcategories, it's a single *non-recursive* query.
> That's huge if you are doing a lot of searches like this.

You are right, that non-recursive bit is important. In fact, I think
that I'm convinced. Thanks!


>> But what a mess this would be if the two methods go out of sync!
>
> Sure, but these values would be maintained by your code--not end-users. It
> just comes down to making sure your code is correct through appropriate unit
> tests. By moving the logic to a stored procedure, you can ensure the table
> is locked during the updates to keep two users from adding a new category
> simultaneously.

So long as it is in fact my code, that's fine. But when others start
maintaining it and not reading comments, it may get ugly. That does
not apply to this particular pet project, but it is a consideration
for future projects.


>> That pays off more? For the guy writing code or for the database
>> memory requirement?
>
> Performance-wise. The nested set method looks to be moderately more complex
> code-wise, but luckily that is done just once while querying the database is
> done again and again. As with all optimizations, it's best to measure and
> make sure there's a problem before trying to solve it. Once you've built a
> few hierarchical systems, you'll be able to make a gut call up front.

I see, thanks. Good point about making sure that the problem exists
before trying to fix it, I've seen people optimise away where there is
no bottleneck.


>> Only two update statements, but they are affecting on average half the
>> database's rows!
>
> Of a single table: categories. Hopefully you have far more items that get
> categorized than you do categories.
>

True.

>> Which do you call the hierarchical model? That term is not used in the
>> linked article.
>
> Well, both models are hierarchical in the sense that there's a parent-child
> relationship. By hierarchical here I mean that the method of implementation
> involves each category pointing to its parent directly via a parent_id
> column. Searching for all subcategories of category X requires searching
> first for all children, then all grandchildren, and so on, resulting in a
> recursive query.
> Using the nested sets model requires a single non-recursive query to get the
> same data.
>

I do agree that the non-recursive method at retrieval time advantage
far outweighs the update-half-the-table issue upon addition of an
additional category.

Thanks!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-21 Thread Dotan Cohen
>> Actually, I'm the customer! But assuming that a customer exists, that
>> implies compensation, and therefore fair bait.
> Then that's different altogether. you get to decide what information
> is displayed, and what information is 'sensed', and on what platform.
>

Yes, but before I get to that stage (relatively easy today with high
level languages such as PHP or C#) I need to decide how to organise
the data.

> What do you want to sense and what do you want to display(not to say
> I'm an expert, but I like to think in CS)?
>

The application will pull calendar records by category: entertainment,
food, gov, transportation, etc. The idea is that the user could query
for, say, postal offices open after 17:00 on Thursday, or films
between 20:00 and 22:00 on Monday, or buses leaving Amsterdam to
Rotterdam on Monday morning.

By the way, I figured out what the "troll" issue was. I was confusing
your signature for the body of the post. It was familiar, too, so I
thought that it may have been a troll post on /. or such. It turns out
that I had seen it on the Python-tutor list, and their I also mistook
it for a troll as the length of the sig far outweighs the length of
your typically concise and to-the-point post. Constructive advice:
trim the sig!

Thanks!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-21 Thread Richard Quadling
On 21 January 2011 05:34, Paul M Foster  wrote:
> On Thu, Jan 20, 2011 at 12:05:53PM -0800, David Harkness wrote:
>
>> On Thu, Jan 20, 2011 at 7:00 AM, Richard Quadling wrote:
>>
>> > I'd recommend using a nested set approach for the tags
>> > (http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
>> > gives a good explanation on the issues and methodology of nested
>> > sets).
>> >
>>
>> Thanks for the link. That article proposes an interesting way to organize
>> the categories. Have you implemented this in the wild? Clearly the design
>> would work as it's pretty simple, and I like that it removes the need for
>> recursive queries.
>>
>> Dotan, the Venn diagrams are just used to explain the concept. If you use
>> the code to determine the left and right values, you can ignore the diagrams
>> entirely. As long as you're not adding/removing categories every minute,
>> having to recalculate left and right values isn't that big of a deal.
>
> For the curious, here is a different article which explains the same
> concept:
>
> http://articles.sitepoint.com/article/hierarchical-data-database
>
> Paul

Nice article. Sitepoint has a lot of useful articles.

And as the conclusion says ...

"You're now familiar with both ways to store trees in a database.
While I have a slight preference for the modified preorder tree
traversal, in your particular situation the adjacency list method
might be better. I'll leave that to your own judgement."

So, basically, Dotan, you have to do some analysis to determine which
solution is best for you. But as soon as you say "multiple levels" I
think nested sets. If you are going to enforce just 1 or 2 levels,
then MAYBE adjacent list is the way for you.

Richard.

-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-21 Thread Richard Quadling
On 20 January 2011 19:20, Dotan Cohen  wrote:
> On Thu, Jan 20, 2011 at 19:21, Richard Quadling  wrote:
>>> That is terrific, at least the first half. The second half, with the
>>> Venn diagrams, is awkward!
>>
>> When you get heavily nested data, the adjacent list model (where you
>> have a parentid for every uniqueid), you very quickly get into
>> complicated logic trying to traverse n-levels. The nested set model is
>> specifically built to handle this issue. I'd recommend getting to
>> grips with it. It will make finding items belonging to a group (or a
>> super group) a LOT easier.
>>
>> Especially if you have multiple tag hierarchies.
>>
>
> Is that strategy widely deployed, then? It seems so unruly having to
> change on average half the database records for every new leaf.
>
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com
>

Changing data in a database is the role of the database engine. It is
much more efficient to have the cost on the insert than it is on the
select.

The adjacent list model is very expensive at n-levels for the select,
but trivial cost for the insert. If you are inserting millions of rows
but only occasionally looking at the data, then stick with the
adjacent list model. But if tags and n-levels are regularly accessed
and form a main part to the functionality of the app, then you may
want to reconsider.

Sure, the insert for the nested set model is more expensive in terms
of the number of rows to amend, but indexing will certainly should
certainly help. If you have tools to help optimize the tag table and
the queries you use, then I'd follow the recommendations (I use MS
SQL, so my Query Optimization tools help me here). The nested set
model is extremely efficient on the select.

It is a trade off that you have to decide upon, based upon your data
and needs. If, as I suspect, you are going to be doing a LOT of
selects on the tags and (in the future) to multiple levels, then this
aspect needs to be very efficient.

For me it is well worth the effort of moving from the adjacent list
model to the nested set model.

Both mechanisms work. In my opinion, the adjacent list model is for
truly simply lookups, not for complicated n-levels.

One of the changes I made to the nested set model was for a Bill Of
Materials module. The client made complex machinery (industrial
lathes). The sum quantity for all the parts were in the 20,000 region.
Each nut, bolt, screw, etc. Many sub-assemblies. The parent/child node
logic was massive in dealing with retrieving questions like "How many
machines can we build?", "What stock do we need to buy/make to
complete an order of 20 lathes?". Lot's of recursion into each level
to build the list. Getting the results would take 3 or 4 minutes (this
is in a non SQL environment using a peer-to-peer modified D-ISAM
database - it was already slow because of all that). When I moved to
the nested set model, no recursion and 1 query (more or less) and I
have all the results I needed. It was seconds in comparison.

Regards,

Richard.
-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Paul M Foster
On Thu, Jan 20, 2011 at 12:05:53PM -0800, David Harkness wrote:

> On Thu, Jan 20, 2011 at 7:00 AM, Richard Quadling wrote:
> 
> > I'd recommend using a nested set approach for the tags
> > (http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
> > gives a good explanation on the issues and methodology of nested
> > sets).
> >
> 
> Thanks for the link. That article proposes an interesting way to organize
> the categories. Have you implemented this in the wild? Clearly the design
> would work as it's pretty simple, and I like that it removes the need for
> recursive queries.
> 
> Dotan, the Venn diagrams are just used to explain the concept. If you use
> the code to determine the left and right values, you can ignore the diagrams
> entirely. As long as you're not adding/removing categories every minute,
> having to recalculate left and right values isn't that big of a deal.

For the curious, here is a different article which explains the same
concept:

http://articles.sitepoint.com/article/hierarchical-data-database

Paul

-- 
Paul M. Foster
http://noferblatz.com


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread David Harkness
On Thu, Jan 20, 2011 at 12:21 PM, Dotan Cohen  wrote:

> I understood that. My concern is exactly with adding new nodes. There
> is no incrementor (++i) in SQL, so knowingly coding a solution that
> will require incrementing two fields in half the database rows seems
> irresponsible.
>

It only requires updating the category rows. If you have several hundred
categories this is a non-issue. If you have several thousand categories, you
probably have millions of products, and you'd want to do some performance
analysis on it. Even still, this is necessary only when adding new
categories.

If you are doing this often, you could leave spaces in the left and right
values so that you could minimize the number of rows that need to be
updated. The article makes every leaf use x and x+1 for left and right which
forces another update to add a child. If instead you used x and x+20 you'd
leave space for more children without any updates. This could be applied
from top to bottom, starting with the root category getting 0 and MAX_INT
for its values.

However, it's probably not even worth applying that complexity until you
prove that frequent category additions are causing problems. Most systems
will be querying against the categories table far more frequently, and
that's where this model pays off. If you want to see all products in
category X and its subcategories, it's a single *non-recursive* query.
That's huge if you are doing a lot of searches like this.

But what a mess this would be if the two methods go out of sync!
>

Sure, but these values would be maintained by your code--not end-users. It
just comes down to making sure your code is correct through appropriate unit
tests. By moving the logic to a stored procedure, you can ensure the table
is locked during the updates to keep two users from adding a new category
simultaneously.

That pays off more? For the guy writing code or for the database
> memory requirement?
>

Performance-wise. The nested set method looks to be moderately more complex
code-wise, but luckily that is done just once while querying the database is
done again and again. As with all optimizations, it's best to measure and
make sure there's a problem before trying to solve it. Once you've built a
few hierarchical systems, you'll be able to make a gut call up front.

Only two update statements, but they are affecting on average half the
> database's rows!
>

Of a single table: categories. Hopefully you have far more items that get
categorized than you do categories.


> Which do you call the hierarchical model? That term is not used in the
> linked article.
>

Well, both models are hierarchical in the sense that there's a parent-child
relationship. By hierarchical here I mean that the method of implementation
involves each category pointing to its parent directly via a parent_id
column. Searching for all subcategories of category X requires searching
first for all children, then all grandchildren, and so on, resulting in a
recursive query.

Using the nested sets model requires a single non-recursive query to get the
same data.

David


Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Dotan Cohen
On Thu, Jan 20, 2011 at 22:05, David Harkness  wrote:
> Thanks for the link. That article proposes an interesting way to organize
> the categories. Have you implemented this in the wild? Clearly the design
> would work as it's pretty simple, and I like that it removes the need for
> recursive queries.

I am also interested in knowing if this approach is used in any production code.


> Dotan, the Venn diagrams are just used to explain the concept. If you use
> the code to determine the left and right values, you can ignore the diagrams
> entirely. As long as you're not adding/removing categories every minute,
> having to recalculate left and right values isn't that big of a deal.

I understood that. My concern is exactly with adding new nodes. There
is no incrementor (++i) in SQL, so knowingly coding a solution that
will require incrementing two fields in half the database rows seems
irresponsible.


> Also, there's no reason you couldn't keep the parent_id field with the
> nested sets. It would come in handy for certain types of queries, though
> it's not necessary.

That is true. I could store both methods, and experiment to see which
is preferable. But what a mess this would be if the two methods go out
of sync! Isn't there a name for that in SQL, something along the lines
of not storing the same data in two places lest one should change and
not the other? The term escapes me.


>> I disagree. The method I proposed can be extended to any depth, and any
>> leaf or branch can be retrieved with a single query.
>
> The nested set method can be extended to any depth, and it pays off more the
> larger the hierarchy grows. While you can retrieve any branch (all
> ancestors) of a node with a single SQL query, the SQL engine itself actually
> must perform a recursive query meaning multiple hits on the parent_id index.

That pays off more? For the guy writing code or for the database
memory requirement?


>> I suppose for retrievals this structure has advantages, but unless
>> MySQL has a ++ operator (or better yet, one that adds or subtracts 2
>> from an int) then it looks to be a pain to add nodes.
>
> ++ or += wouldn't be any better here than x = x + 2. Once you're modifying
> indexed values, you'll pay a much higher price writing to disk than += could
> ever save you in CPU cycles. The beauty is that inserting a node requires
> only two update statements that will fix *all* categories that need to be
> adjusted.

Only two update statements, but they are affecting on average half the
database's rows!


> Adding categories to the hierarchical model is definitely faster
> so it comes down to your insert-to-select ratio. Moving a subtree is also
> much easier with the hierarchical model.

Which do you call the hierarchical model? That term is not used in the
linked article.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread David Harkness
On Thu, Jan 20, 2011 at 7:00 AM, Richard Quadling wrote:

> I'd recommend using a nested set approach for the tags
> (http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
> gives a good explanation on the issues and methodology of nested
> sets).
>

Thanks for the link. That article proposes an interesting way to organize
the categories. Have you implemented this in the wild? Clearly the design
would work as it's pretty simple, and I like that it removes the need for
recursive queries.

Dotan, the Venn diagrams are just used to explain the concept. If you use
the code to determine the left and right values, you can ignore the diagrams
entirely. As long as you're not adding/removing categories every minute,
having to recalculate left and right values isn't that big of a deal.

Also, there's no reason you couldn't keep the parent_id field with the
nested sets. It would come in handy for certain types of queries, though
it's not necessary.

On Thu, Jan 20, 2011 at 11:40 AM, Jerry Schwartz  wrote:

> I disagree. The method I proposed can be extended to any depth, and any
> leaf or branch can be retrieved with a single query.
>

The nested set method can be extended to any depth, and it pays off more the
larger the hierarchy grows. While you can retrieve any branch (all
ancestors) of a node with a single SQL query, the SQL engine itself actually
must perform a recursive query meaning multiple hits on the parent_id index.

On Thu, Jan 20, 2011 at 11:59 AM, Dotan Cohen  wrote:

> I suppose for retrievals this structure has advantages, but unless
> MySQL has a ++ operator (or better yet, one that adds or subtracts 2
> from an int) then it looks to be a pain to add nodes.
>

++ or += wouldn't be any better here than x = x + 2. Once you're modifying
indexed values, you'll pay a much higher price writing to disk than += could
ever save you in CPU cycles. The beauty is that inserting a node requires
only two update statements that will fix *all* categories that need to be
adjusted. Adding categories to the hierarchical model is definitely faster
so it comes down to your insert-to-select ratio. Moving a subtree is also
much easier with the hierarchical model.

David


Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread David Hutto
On Thu, Jan 20, 2011 at 2:26 PM, Dotan Cohen  wrote:
> On Thu, Jan 20, 2011 at 21:24, David Hutto  wrote:
>>> Is this a troll? Am I about to be baited?
>>
>> Baited to deploy what is designed to the consumer's specification?
>> Surely. From what is wanted to what is needed. Troll on that.
>
> Actually, I'm the customer! But assuming that a customer exists, that
> implies compensation, and therefore fair bait.
Then that's different altogether. you get to decide what information
is displayed, and what information is 'sensed', and on what platform.

What do you want to sense and what do you want to display(not to say
I'm an expert, but I like to think in CS)?

>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com
>



-- 
The lawyer in me says argue...even if you're wrong. The scientist in
me... says shut up, listen, and then argue. But the lawyer won on
appeal, so now I have to argue due to a court order.

Furthermore, if you could be a scientific celebrity, would you want
einstein sitting around with you on saturday morning, while you're
sitting in your undies, watching Underdog?...Or better yet, would
Einstein want you to violate his Underdog time?

Can you imagine Einstein sitting around in his underware? Thinking
about the relativity between his pubic nardsac, and his Fruit of the
Looms, while knocking a few Dorito's crumbs off his inner brilliant
white thighs, and hailing E = mc**2, and licking the orangy,
delicious, Doritoey crust that layered his genetically rippled
fingertips?

But then again, J. Edgar Hoover would want his pantyhose intertwined
within the equation.

However, I digress, momentarily.

But Einstein gave freely, for humanity, not for gain, other than
personal freedom.

An equation that benefited all, and yet gain is a personal product.

Also, if you can answer it, is gravity anymore than interplanetary static cling?

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Dotan Cohen
On Thu, Jan 20, 2011 at 21:24, David Hutto  wrote:
>> Is this a troll? Am I about to be baited?
>
> Baited to deploy what is designed to the consumer's specification?
> Surely. From what is wanted to what is needed. Troll on that.

Actually, I'm the customer! But assuming that a customer exists, that
implies compensation, and therefore fair bait.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread David Hutto
> Is this a troll? Am I about to be baited?

Baited to deploy what is designed to the consumer's specification?
Surely. From what is wanted to what is needed. Troll on that.
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com
>



-- 
The lawyer in me says argue...even if you're wrong. The scientist in
me... says shut up, listen, and then argue. But the lawyer won on
appeal, so now I have to argue due to a court order.

Furthermore, if you could be a scientific celebrity, would you want
einstein sitting around with you on saturday morning, while you're
sitting in your undies, watching Underdog?...Or better yet, would
Einstein want you to violate his Underdog time?

Can you imagine Einstein sitting around in his underware? Thinking
about the relativity between his pubic nardsac, and his Fruit of the
Looms, while knocking a few Dorito's crumbs off his inner brilliant
white thighs, and hailing E = mc**2, and licking the orangy,
delicious, Doritoey crust that layered his genetically rippled
fingertips?

But then again, J. Edgar Hoover would want his pantyhose intertwined
within the equation.

However, I digress, momentarily.

But Einstein gave freely, for humanity, not for gain, other than
personal freedom.

An equation that benefited all, and yet gain is a personal product.

Also, if you can answer it, is gravity anymore than interplanetary static cling?

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Dotan Cohen
On Thu, Jan 20, 2011 at 20:50, David Hutto  wrote:
> Pseudo = Design Algorithm
> Design Algorithm = Actual Code
> Actual Code = Alterable db tables
> Alterable db tables = manipulated data through the app interface with data
>
> --
> The lawyer in me says argue...even if you're wrong. The scientist in
> me... says shut up, listen, and then argue. But the lawyer won on
> appeal, so now I have to argue due to a court order.
>
> Furthermore, if you could be a scientific celebrity, would you want
> einstein sitting around with you on saturday morning, while you're
> sitting in your undies, watching Underdog?...Or better yet, would
> Einstein want you to violate his Underdog time?
>
> Can you imagine Einstein sitting around in his underware? Thinking
> about the relativity between his pubic nardsac, and his Fruit of the
> Looms, while knocking a few Dorito's crumbs off his inner brilliant
> white thighs, and hailing E = mc**2, and licking the orangy,
> delicious, Doritoey crust that layered his genetically rippled
> fingertips?
>
> But then again, J. Edgar Hoover would want his pantyhose intertwined
> within the equation.
>
> However, I digress, momentarily.
>
> But Einstein gave freely, for humanity, not for gain, other than
> personal freedom.
>
> An equation that benefited all, and yet gain is a personal product.
>
> Also, if you can answer it, is gravity anymore than interplanetary static 
> cling?
>

Is this a troll? Am I about to be baited?

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Dotan Cohen
On Thu, Jan 20, 2011 at 19:21, Richard Quadling  wrote:
>> That is terrific, at least the first half. The second half, with the
>> Venn diagrams, is awkward!
>
> When you get heavily nested data, the adjacent set model (where you
> have a parentid for every uniqueid), you very quickly get into
> complicated logic trying to traverse n-levels. The nested set model is
> specifically built to handle this issue. I'd recommend getting to
> grips with it. It will make finding items belonging to a group (or a
> super group) a LOT easier.
>
> Especially if you have multiple tag hierarchies.
>

Is that strategy widely deployed, then? It seems so unruly having to
change on average half the database records for every new leaf.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread David Hutto
Pseudo = Design Algorithm
Design Algorithm = Actual Code
Actual Code = Alterable db tables
Alterable db tables = manipulated data through the app interface with data

-- 
The lawyer in me says argue...even if you're wrong. The scientist in
me... says shut up, listen, and then argue. But the lawyer won on
appeal, so now I have to argue due to a court order.

Furthermore, if you could be a scientific celebrity, would you want
einstein sitting around with you on saturday morning, while you're
sitting in your undies, watching Underdog?...Or better yet, would
Einstein want you to violate his Underdog time?

Can you imagine Einstein sitting around in his underware? Thinking
about the relativity between his pubic nardsac, and his Fruit of the
Looms, while knocking a few Dorito's crumbs off his inner brilliant
white thighs, and hailing E = mc**2, and licking the orangy,
delicious, Doritoey crust that layered his genetically rippled
fingertips?

But then again, J. Edgar Hoover would want his pantyhose intertwined
within the equation.

However, I digress, momentarily.

But Einstein gave freely, for humanity, not for gain, other than
personal freedom.

An equation that benefited all, and yet gain is a personal product.

Also, if you can answer it, is gravity anymore than interplanetary static cling?

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread David Harkness
I cannot agree more with the others about using a join table. While it's
tempting to go with your first solution due to fear of performance issues,
you can usually address performance issues with a technical solution.
Addressing problems that arise from a constraining design choice is much
more difficult.

David


Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Richard Quadling
On 20 January 2011 16:20, Dotan Cohen  wrote:
> On Thu, Jan 20, 2011 at 17:00, Richard Quadling  wrote:
>> I'd have my items table, my tags table and a join table for the two.
>> My join table is really simple. UniqueID, ItemID, TagID.
>>
>
> Yes, that is the first approach that I mentioned. It looks to be a
> good compromise.
>
>
>> I'd recommend using a nested set approach for the tags
>> (http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
>> gives a good explanation on the issues and methodology of nested
>> sets).
>>
>
> That is terrific, at least the first half. The second half, with the
> Venn diagrams, is awkward!
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com
>

When you get heavily nested data, the adjacent set model (where you
have a parentid for every uniqueid), you very quickly get into
complicated logic trying to traverse n-levels. The nested set model is
specifically built to handle this issue. I'd recommend getting to
grips with it. It will make finding items belonging to a group (or a
super group) a LOT easier.

Especially if you have multiple tag hierarchies.





-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Dotan Cohen
On Thu, Jan 20, 2011 at 18:20, Dotan Cohen  wrote:
> On Thu, Jan 20, 2011 at 17:00, Richard Quadling  wrote:
>> I'd have my items table, my tags table and a join table for the two.
>> My join table is really simple. UniqueID, ItemID, TagID.
>>
>
> Yes, that is the first approach that I mentioned. It looks to be a
> good compromise.
>
>

Sorry, that was _not_ the first approach that I mentioned. It looks to
be the right method though, thanks!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Dotan Cohen
On Thu, Jan 20, 2011 at 17:00, Richard Quadling  wrote:
> I'd have my items table, my tags table and a join table for the two.
> My join table is really simple. UniqueID, ItemID, TagID.
>

Yes, that is the first approach that I mentioned. It looks to be a
good compromise.


> I'd recommend using a nested set approach for the tags
> (http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
> gives a good explanation on the issues and methodology of nested
> sets).
>

That is terrific, at least the first half. The second half, with the
Venn diagrams, is awkward!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Organisational question: surely someone has implemented many Boolean values (tags) and a solution exist

2011-01-20 Thread Richard Quadling
On 20 January 2011 14:32, Dotan Cohen  wrote:
> I am designing an application that make heavy usage of one-to-many
> tags for items. That is, each item can have multiple tags, and there
> are tens of tags (likely to grow to hundreds). Most operation on the
> database are expected to be searches for the items that have a
> particular tag. That is, users will search per tags, not per items.
>
> These are the ways that I've thought about storing the tags, some bad
> and some worse. If there is a better way I'd love to know.
>
> 1) Each item will get a row in a "tags" table, with a column for each tag.
> mysql> CREATE TABLE tags (
>    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
>    item VARCHAR(100),
>    tag1 bool,
>    tag2 bool,
>    
>    tagN bool
> );
>
> With this approach I would be adding a new column every time a new
> category is added. This looks to me a good way given that users will
> be searching per tag and a simple "SELECT item FROM tags WHERE
> tag1=true;" is an easy, inexpensive query. This table will get very
> large, there will likely be literally thousands of items (there will
> exist more items than tags).
>
>
>
> 2) Store the applicable tags one per line in a text field in the items table.
> mysql> CREATE TABLE items (
>    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
>    item VARCHAR(100),
>    tags text,
> );
>
> This looks like a bad idea, searching by tag will be a mess.
>
>
>
> 3) Store the tags in a table and add items to a text field. For instance:
> mysql> CREATE TABLE tags (
>    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
>    tagName VARCHAR(100),
>    items text,
> );
>
> This looks to be the best way from a MySQL data retrieval perspective,
> but I do not know how expensive it will be to then split the items in
> PHP. Furthermore, adding items to tags could get real expensive.
>
>
>
> Caveat: at some point in the future there may be added the ability to
> have a tag hierarchy. For instance, there could exist a tag
> "restaurant" that will get the subtags "italian" and "french". I could
> fake this with any approach by having a table of existing tags with a
> "parentTag" field, so if I plan on having this table anyway would
> method 3 above be preferable?
>
> Note: this message is cross-posted to the MySQL and the PHP lists as I
> am really not sure where is the best place to do the logic. My
> apologies to those who receive the message twice.
>
> Thanks!
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>

I'd have my items table, my tags table and a join table for the two.
My join table is really simple. UniqueID, ItemID, TagID.

I'd recommend using a nested set approach for the tags
(http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
gives a good explanation on the issues and methodology of nested
sets).



-- 
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php